Why AI upscaling doesn't fix your old tapes

marshalleq

Topic starter 28 May 2026 17:31

Most family-archive projects arrive at the same question eventually. You've captured the tape, you've watched it back, you can see the soft picture and the speckles, and somewhere along the way you've read about an AI tool that promises to turn standard-definition video into something approaching 4K. Topaz Video AI is the usual name. The question is whether you can just run your tapes through that and skip the rest.

The honest answer is mostly no — but the nuance matters more than either the marketing or the purist forums make out. AI tools do have a real place in this work. The place tends to be narrower than the marketing suggests and broader than the purists allow, and where it sits depends on what you mean by “your tapes” and which copy you're talking about. So this is the version with the nuance kept in.

New to any of these terms? The glossary of terms is the right place to start.

The shape of the answer up front

If you want the one-paragraph summary before the long read, here it is. For an archive master — the file you're hoping will outlive your hard drives and your hobby and possibly you — AI upscaling is the wrong tool, for reasons that come down to the way the model is built and the way the source is built. For a viewing copy — the file you actually watch, share, or upload — AI has a few narrow, useful applications, and a much larger number of expensive ways to make things worse. The rule that follows from that, and which the rest of this article is essentially an unpacking of, is the two-master discipline: capture cleanly, keep the original untouched, and treat any AI processing as something you do to a copy of that original, never to the original itself.

Why analog tape is the wrong test for AI upscaling

The thing worth understanding about an AI upscaler is what it was trained on. In the published research and in most open-source models (Real-ESRGAN, BasicVSR, the academic super-resolution literature), somebody takes a large collection of high-resolution digital masters (modern films, broadcast HD, photography), runs each one through an artificial degradation step (shrink to a lower resolution, add some compression noise, blur it slightly), and feeds the model a very large number of paired examples: degraded input on one side, clean original on the other. The model's entire job is to learn to reverse that specific artificial degradation. It's good at it, because that's what it has seen. Commercial tools like Topaz Video AI don't publish their training data, so the picture is less certain at that end of the market. Even so, the community's experience with Topaz on VHS is consistent with what you'd expect from a model that hasn't seen much real tape in training: it treats the format's characteristic artefacts as image content rather than recognising them as part of the medium.
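To make the training setup concrete, here is a minimal sketch of how one synthetic training pair might be built. It works on a single scanline of brightness values rather than real video, and the function names, blur width, noise level and scale factor are all illustrative assumptions of mine, not anything taken from Real-ESRGAN, BasicVSR or Topaz.

```python
import random


def degrade(clean, factor=2, noise=0.02, seed=0):
    """Make a synthetic low-quality version of a clean 1-D scanline (values 0..1)."""
    rng = random.Random(seed)
    n = len(clean)
    # Blur: 3-tap box filter, clamping at the edges.
    blurred = [
        (clean[max(i - 1, 0)] + clean[i] + clean[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]
    # Shrink: keep every `factor`-th sample.
    small = blurred[::factor]
    # Noise: a small Gaussian perturbation standing in for compression noise.
    return [min(1.0, max(0.0, v + rng.gauss(0.0, noise))) for v in small]


def make_training_pair(clean):
    """Return (degraded input, clean target), the pair the model learns from."""
    return degrade(clean), clean


clean_line = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
low, high = make_training_pair(clean_line)
print(len(low), "degraded samples paired with", len(high), "clean samples")
```

Nothing in `degrade` resembles head-switching noise, chroma misregistration or a dropout. A model trained only on pairs like these has never been shown those artefacts as something to undo.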

Now consider what comes off a VHS tape. The chroma signal was recorded down at around 629 kHz, with usable bandwidth in the region of 400 kHz, roughly an order of magnitude less than the luminance gets. The interlacing. The white band of head-switching noise at the bottom of every frame. Chroma that doesn't sit on top of luminance quite right. Ghosting where the FM modulation has smeared a high-contrast edge. Sparkle and grain that move from one play to the next. Dropouts where the head briefly lost the carrier. None of these are the degradation the model was trained on. From the model's point of view they aren't degradation at all; they're image content, because they look unlike anything in the training pairs.

So the model upscales them. Faithfully. It treats the head-switch band as detail, the chroma fringing as detail, the sparkle as detail, and inflates all of it to whatever target resolution you asked for. The result looks plausible at a thumbnail. At full resolution it falls apart, because what you're looking at is a high-resolution rendering of the things you were hoping to remove.

There's a deeper version of this argument too, and it sits underneath everything else in this article. A VHS recording at roughly 240 lines of horizontal resolution for NTSC, or 250 for PAL (240 to 250 TVL, television lines, in the analog-video shorthand) is the original at that resolution. The tape didn't fail to capture some higher-resolution master; the tape was the master at that resolution, and the analog signal stops there. Film, by contrast, has physical grain finer than most scans of it, and a higher-resolution scan of the same negative recovers real detail that was always there. Tape has no such reserve. There is no finer detail on the tape waiting for a better capture to reveal it. The implication is the one most people resist when they first encounter it: any pixel an AI upscaler adds above the tape's native resolution is invented. Not recovered. Not enhanced. Made up, by a model, from a prior of what it thinks the picture probably should look like. That prior was trained on someone else's content.
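For a sense of scale, the TVL figures can be turned into rough pixel terms. TVL is counted per picture height, so multiplying by the aspect ratio gives the number of resolvable elements across the full width. The sketch below assumes a 4:3 picture and a 720-pixel-wide capture; both are my assumptions, and real captures vary.

```python
def tvl_to_full_width(tvl, aspect=4 / 3):
    """TVL counts resolvable lines per picture height; scale by aspect for full width."""
    return tvl * aspect


CAPTURE_WIDTH = 720  # assumed SD capture width in pixels

for name, tvl in (("NTSC VHS", 240), ("PAL VHS", 250)):
    resolvable = tvl_to_full_width(tvl)
    oversample = CAPTURE_WIDTH / resolvable
    print(f"{name}: about {resolvable:.0f} resolvable elements across the width, "
          f"stored in {CAPTURE_WIDTH} pixels ({oversample:.2f}x oversampled)")
```

On those assumptions an SD capture already holds more than twice as many horizontal samples as the tape can resolve, before any upscaling.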

Image-and-video super-resolution as a field calls this the one-to-many problem: there are many different high-resolution images that could produce the same low-resolution one when downscaled, and the model picks one. The pick can be sharp and convincing. It isn't your tape. For science, medicine, and forensic work, this property is a disqualifier — you can't have a model picking among possible originals when the question is what was actually there. For entertainment, where the brain happily papers over plausible detail and the goal is “looks good enough”, it can be acceptable. Knowing which side of that line you're on matters.
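The one-to-many problem is easy to show with a toy example. The sketch below uses a simple two-sample average as the downscaler, which is my choice for illustration (real resamplers are more elaborate), and three made-up "high-resolution" rows.

```python
def box_downscale(signal, factor=2):
    """Average each group of `factor` samples into one output sample."""
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, len(signal), factor)
    ]


# Three different high-resolution rows (hypothetical brightness values).
candidate_a = [10, 30, 50, 50]
candidate_b = [20, 20, 40, 60]
candidate_c = [0, 40, 45, 55]

for candidate in (candidate_a, candidate_b, candidate_c):
    print(candidate, "->", box_downscale(candidate))
```

All three rows collapse to the same low-resolution row. An upscaler handed that row has no way to know which candidate it came from; it can only pick the one its training makes most likely.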

The formal preservation field has implied this for years without quite saying it directly. The IASA, NDSA, FADGI and AMIA documents talk about authenticity, signal integrity, and the duty to preserve the original as the original — but none of them, as of the time of writing, have published a position statement that explicitly addresses generative AI restoration applied to consumer analog tape. The institutional silence is itself data. The principles are there; the specific guidance for this specific question hasn't been written down yet. Until it is, the principles are what we have to work from.

What AI tools genuinely help with on tape material

I want to be specific here, because the broad-brush “AI bad” position misses real ground. There are tools and techniques on this list that I'd use without hesitation when the situation calls for them — usually on a viewing copy, often as part of a larger pipeline that isn't AI-first.

Frame interpolation. Of all the AI-class tools in current use, frame interpolation is the one with the least friction. The open-source RIFE tool, and Topaz's Chronos model in the commercial world, both do a creditable job of synthesising in-between frames — useful when smoothing a stuttery handheld pan, recovering a dropped frame, or doing a careful framerate conversion. Standards-conversion work (30→24, say) tends to look acceptable on low-motion material and worse the faster the motion gets; that's true for AI and non-AI tooling alike.

Dropout inpainting on viewing copies (experimental). ProPainter and similar inpainting tools can do a reasonable job of filling small dropouts on a per-field basis. The output is currently 8-bit RGB and the pipeline is fiddly — separate field handling, conversion back to YCbCr — and it absolutely belongs in the viewing-copy lane rather than the archive lane, because the filled pixels are the model's guess at what was there, not a recovery of what was there. For a tape you actually want to watch, it can be a real improvement on the bare dropout. For an archival master, the bare dropout is the more honest record.

Light denoise on already-clean captures. The temporal denoiser inside Topaz, set conservatively on a clean SD source that's already been deinterlaced, produces results roughly competitive with traditional motion-compensated denoisers (SMDegrain, KNLMeansCL) for the last layer of fine hiss. Two preconditions matter. The capture has to be clean to start with — feeding heavy VHS noise into AI denoise tends to emphasise the ringing and chroma artefacts rather than remove them, because the model treats the artefacts as detail. And the source has to be deinterlaced first; AI denoisers do badly on interlaced source because they can't tell which fields belong to which moment. If both preconditions are met, AI denoise is a usable choice. If either isn't, the traditional toolchain tends to be safer.

Neural Y/C separation inside the decode pipeline. Down at the lower levels of the vhs-decode toolchain, there are neural-network-trained models doing chroma separation — the work of pulling colour information cleanly away from luminance in composite-encoded sources. These aren't AI upscalers in the consumer sense; they're trained models doing a narrow signal-processing task on the demodulated data. For the family archivist, the relevant point is that “AI” as a label spans a wide range of things, and some of them are doing genuinely useful work that has nothing to do with hallucinating extra pixels.

Nnedi3 inside QTGMC. The default community deinterlacer, QTGMC, has used a small neural network (nnedi3) for spatial interpolation for years. It's so well-integrated and so well-behaved that the community largely doesn't think of it as “AI” — it's just part of how QTGMC works. By the strict definition it is AI; in practice it sits closer to a careful interpolator than to a generative model. The point is that neural-network-augmented processing has been in the standard pipeline for a long time, doing useful work, without controversy.

Animation and cartoon content. AI upscalers tend to do considerably better on flat-shaded animation than on live-action material. Hard lines and large areas of single-colour fill are exactly the kind of structure the training data is full of. If the tape is an animated cartoon rather than a family wedding, the same tool can produce a meaningfully better result. Community-trained models specifically for anime and cartoon LaserDisc upscaling have existed for years.

Where AI upscaling falls down on tape — the failure modes worth knowing about

The other side of the same coin. These are the cases where I'd be cautious, or wouldn't use the tool at all.

Generative upscaling on VHS live-action. This is the headline use case and the one most family archivists arrive at. It's also where the structural mismatch between trained model and tape source is most visible. Generative upscalers tend to produce a characteristic appearance: sharpened, defined outlines on objects, with the fine detail inside the outlines smoothed away. Edge structure is enhanced; texture is lost. On atmospheric content, the model will often interpret real haze or low-light noise as defect and clean it out, producing a “clear day” version of footage that was overcast or dim. That isn't restoration; it's a different recording.

Faces. Family video is largely faces, and faces are the category AI upscalers handle worst. The result is the familiar plastic / mannequin / Snapchat-filter appearance — smoothed, generic, oddly youthened. For a family wedding or a child's birthday, the entire emotional content of the footage lives in those faces, and the AI version replaces them with the model's idea of similar-looking faces. The likeness is approximate. Personally I find it unsettling. I'd avoid AI face restoration on any footage with people in it I care about, and that covers most family material by definition.

Deinterlacing as a headline AI feature. Topaz Video AI is marketed in part on its deinterlacing capability. If you peek behind the marketing and look at the actual command lines the tool emits, it turns out the deinterlacer being used internally is bwdif — a perfectly good open-source FFmpeg filter, but not AI in any meaningful sense. The AI branding gets attached to the upscale or denoise stage that runs after deinterlacing. Topaz isn't doing anything wrong by using bwdif; it just isn't doing what the marketing implies it's doing. For deinterlacing specifically, QTGMC remains the community default, and there isn't a strong AI-based reason to switch.
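For reference, a plain bwdif deinterlace in FFmpeg looks roughly like the command built below. This is a sketch of the open-source filter used on its own, not a reproduction of the command lines Topaz emits; the file names are placeholders and the FFV1 intermediate is my assumption.

```python
import shlex


def build_bwdif_command(src, dst, double_rate=True):
    """Build an ffmpeg command that deinterlaces `src` with bwdif into `dst`."""
    # send_field outputs one frame per field (double rate);
    # send_frame outputs one frame per frame pair (same rate).
    mode = "send_field" if double_rate else "send_frame"
    return [
        "ffmpeg", "-i", src,
        "-vf", f"bwdif=mode={mode}:parity=auto:deint=all",
        "-c:v", "ffv1",   # lossless intermediate for the viewing-copy pipeline
        "-c:a", "copy",
        dst,
    ]


# Placeholder file names; the master itself is only ever read, never overwritten.
print(shlex.join(build_bwdif_command("master.mkv", "viewing_deinterlaced.mkv")))
```

The function only builds and prints the command, so nothing runs until you choose to run it, and the output always goes to a new file.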

Recovering dropouts on an archive master. This is a viewing-copy operation, not a master operation. The data isn't there. The model invents something that looks like detail and pastes it over the gap. On a viewing copy that's a reasonable creative choice — you're making something to watch, not an evidential record. On the master, I'd argue it's worse than leaving the dropout visible, because at least a visible dropout is honest about being damage. A hallucinated dropout fill is damage pretending to be detail.

Chroma reconstruction in the general case. Some experimental work in the community uses AI tools to clean up the chroma signal specifically — process the colour channels through an AI denoiser, then recombine with the original luminance — and the early results are interesting. The technique is genuinely fiddly, requires careful chroma/luma separation in the pipeline, and is firmly experimental rather than something I'd offer to a family archivist as a recipe. The point is mostly that AI applied surgically to a narrow part of the signal chain is a more credible use than AI applied broadly to “the whole picture”.

Topaz Video AI specifically — what it actually is in 2026

Topaz Video AI is the tool family archivists most often arrive at because it's the most marketed and the most consumer-facing. A short, calm description of what it is at the time of writing.

It's commercial software, subscription-based. Topaz moved to a subscription model in October 2025 and perpetual licences are no longer being sold; at the time of writing, the personal tier sits at roughly US$25 per month or US$299 per year, with a higher-priced Pro tier around US$699 per year for users who need local execution of the heaviest models. Pricing changes; the specific numbers are a current-state snapshot rather than a permanent fact.

Topaz bundles several different models under its banner. The older models (Artemis, Proteus, Iris) are conventional convolutional super-resolution and denoise models, available locally on consumer GPUs. The newest model, Starlight, is a generative diffusion-based upscaler. Starlight is the one most relevant to the “fix my VHS” question because it's the model marketed for difficult source material. It's also the most expensive to run: at the personal tier it's cloud-only (you upload, the work happens on Topaz's servers, you download), and at Pro tier it can run locally if you have a high-end NVIDIA GPU with substantial video memory. Practical throughput on local hardware sits in the range of fractions of a frame per second on consumer cards; on a top-end RTX 4090 the reported figure is around 1.4 frames per second. For an hour of source material, that's a long compute job.
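To put a number on “a long compute job”, here is the arithmetic as a small sketch. It takes the 1.4 frames per second figure quoted above at face value and assumes one output frame per source frame; a double-rate deinterlaced source would roughly double the result.

```python
def render_hours(source_minutes, source_fps, render_fps):
    """Hours of compute needed to process a clip at a given throughput."""
    total_frames = source_minutes * 60 * source_fps
    return total_frames / render_fps / 3600


for label, fps in (("NTSC, 29.97 fps", 29.97), ("PAL, 25 fps", 25.0)):
    hours = render_hours(60, fps, 1.4)
    print(f"One hour of {label}: about {hours:.1f} hours of compute")
```

That works out to somewhere around 18 to 21 hours of GPU time per hour of tape on the fastest consumer card in the reported figures, and longer on anything slower.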

The community verdict on Starlight specifically, distilled across a few years of practitioner observation: impressive in some cases, broken in others, with the failure mode being exactly the one the structural argument predicts. On a face it might do a reasonable job of upscaling the genuine detail that was present; on a flat area where there's no detail to recover, it invents texture that wasn't there. Atmosphere shifts. The mood of a scene changes. That's intrinsic to how generative diffusion models work, not a Starlight-specific bug.

It would be unhelpful to either recommend or anti-recommend Topaz in general. The right framing is what it's used for. On a clean, deinterlaced SD capture, treated as a viewing-copy enhancement step rather than a magic fix, Topaz can produce results some practitioners are happy with. Used as a one-button “restore my tape”, it produces the failure modes described above. The tool isn't the problem; the use case is.

It's also worth knowing that several open-source alternatives exist and are being actively developed — Real-ESRGAN models accessed through ChaiNNer, VapourSynth plugins, the newer SeedVR2 model from ByteDance, various community-trained models for specific content types. Some of these may surpass Topaz on specific tasks; the field moves fast and the comparisons aren't stable from one quarter to the next. If you find yourself interested in the tooling for its own sake, the open-source world is where most of the experimentation happens.

The two-master discipline

The single thing every named preservation authority agrees on is the two-master discipline. IASA TC-04 and TC-06, FADGI, NDSA, AMIA, the Library of Congress, the BFI, the NFSA, the Cinémathèque suisse, the Österreichische Mediathek — they all land in the same place, often using different language. The preservation master is the unprocessed, lossless capture of the source signal. Everything else is a derivative. Processing belongs in the derivative.

Translated to the family archivist's situation:

The master is the cleanest capture you can make, kept untouched. For an RF-based workflow it's the FLAC-compressed RF file plus the decoded FFV1 in MKV (see how vhs-decode actually works for what those mean and why). For a conventional CVBS or S-Video capture, it's the lossless FFV1 file. Either way the master is interlaced, full-resolution, untouched.

The viewing copy is a separate file derived from the master. It can be deinterlaced. It can be upscaled — by any technique, AI or otherwise. It can be denoised, frame-interpolated, encoded down to a manageable size, uploaded to YouTube or kept on a phone. None of this touches the master.

The asymmetry of the two-master rule is the whole point. The master is the thing you cannot easily remake: you'd need the deck, the tape (which is degrading), and the patience for another capture session. The viewing copy you can remake in an afternoon, from the master, any time you like. So you make the master once, properly, with as little processing as possible, and you treat every viewing-copy decision as reversible. Twenty years from now, when better tools genuinely suited to tape exist (whether that's a model trained on real captures, or something none of us has thought of yet), you'll pull the master off the shelf and re-derive a better viewing copy. If you apply today's best AI tool to the master and throw away the original, you permanently lock your archive to today's tooling. That tends to look like a worse trade with each passing year.
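One simple way to hold yourself to “kept untouched” is to record a checksum when the master is made and check it before each new derivation. The sketch below uses SHA-256 from the Python standard library; the manifest naming and the helper names are my own convention for illustration, not a preservation standard.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large masters don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(master, manifest=None):
    """Record the master's checksum next to it, e.g. tape01.mkv.sha256."""
    master = Path(master)
    manifest = Path(manifest) if manifest else master.with_suffix(master.suffix + ".sha256")
    manifest.write_text(f"{sha256_of(master)}  {master.name}\n")
    return manifest


def master_is_untouched(master, manifest):
    """True if the master still matches the checksum recorded at capture time."""
    recorded = Path(manifest).read_text().split()[0]
    return sha256_of(master) == recorded


# Typical use (hypothetical file name):
#   manifest = write_manifest("tape01.mkv")        # once, right after capture
#   assert master_is_untouched("tape01.mkv", manifest)  # before every derivation
```

A mismatch doesn't tell you what changed, only that something did, which is the cue to restore the master from a backup rather than carry on.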

The community follows this discipline naturally — the practitioners most active in tape capture keep their RF files, their interlaced FFV1 masters, and separately their derived 4K viewing copies, without much discussion. The institutional literature codifies it formally. The article you're reading is mostly an attempt to make it explicit for family archivists, who arrive at the problem fresh and reasonably ask whether the shortcut works. The shortcut doesn't work; the long path with two masters does.

The YouTube 4K upscale — a quick note

A pattern worth surfacing because it confuses the question. Practitioners who work with tape captures routinely upscale their output to 4K before uploading to YouTube. From the outside this looks like an endorsement of high-ratio upscaling, and by extension a possible endorsement of AI upscaling. It isn't.

The reason for the 4K upscale is that YouTube allocates bitrate by stream resolution. A 1080p stream gets a relatively low bitrate budget and tends to crumble into macroblock noise on detailed material; the same source uploaded at 4K gets a much larger budget and survives the compression considerably better, even though the underlying detail is still SD. The upscale itself is almost always done with a conventional sharp resampler — Spline64 or Lanczos — not with an AI model. The point isn't to invent detail. The point is to fool YouTube's bitrate allocator into giving the upload more bandwidth.
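A conventional upscale of this kind can be sketched as an FFmpeg command built in Python. The example uses FFmpeg's Lanczos scaler and assumes a 4:3 source that has already been deinterlaced into a viewing copy; the encoder settings and file names are illustrative assumptions, not a recommended recipe.

```python
import shlex


def upscale_size(target_height=2160, display_aspect=4 / 3):
    """Frame size for a given height and display aspect, with an even width."""
    width = round(target_height * display_aspect)
    width += width % 2  # 4:2:0 encoders need even dimensions
    return width, target_height


def build_upscale_command(src, dst, target_height=2160):
    """Build an ffmpeg command that resamples `src` with Lanczos into `dst`."""
    width, height = upscale_size(target_height)
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale={width}:{height}:flags=lanczos,setsar=1",
        "-c:v", "libx264", "-crf", "16", "-preset", "slow",
        "-pix_fmt", "yuv420p",
        "-c:a", "aac", "-b:a", "192k",
        dst,
    ]


# Placeholder file names; the input is a deinterlaced viewing copy, not the master.
print(shlex.join(build_upscale_command("viewing_deinterlaced.mkv", "upload_2160p.mp4")))
```

For a 4:3 picture the 2160-line target comes out at 2880 by 2160. The resampler only spreads the existing samples over more pixels; no model is involved and no detail is invented.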

So if you've watched a YouTube video of someone's VHS capture and it looks reasonably good, the cause is unlikely to be AI restoration. It's more likely a conventional sharp upscale to 4K, applied for the bitrate hack, with whatever processing was done sitting upstream of that step. The YouTube practice isn't an argument for AI upscaling either way; it's a platform quirk, with its own logic.

Where AI does sit in institutional archives

For completeness, it's worth knowing that the formal field hasn't simply banned AI from the preservation workflow. Several named institutions and practitioner-grade tools use machine learning openly and routinely. The pattern is consistent: AI is used for description and detection, not for signal generation.

The Institut national de l'audiovisuel in France runs AI across millions of hours of archived material for speaker identification, shot detection, OCR and scene classification — generating metadata, not modifying video. The DVRescue toolkit (developed by MediaArea with the Moving Image Preservation of Puget Sound) uses pattern-recognition techniques to flag dropouts and tape defects, leaving the decision to repair or not to the operator. QCTools (Dave Rice and BAVC) algorithmically characterises video to surface artefacts; the tool deliberately does not modify the file. iZotope RX is widely used in AV archives for audio restoration, but exclusively on access copies, never on the preservation master.

The line the field has settled on, in the absence of formal AI-restoration guidance: AI for description, defect detection, and access-layer audio repair is accepted; AI for signal generation, super-resolution, or generative restoration of preservation masters is not endorsed by any named institutional authority I'm aware of. The Criterion Collection has disclosed AI-based noise reduction on some commercial restoration projects, with the underlying high-resolution scan preserved separately as the authoritative source. That's the closest the commercial world has come to an AI-restoration standard, and it sits comfortably inside the two-master discipline rather than displacing it.

What to do if your tapes matter

Reduced to a practical plan, what falls out of all this is short.

Capture the tape cleanly, once, to the best master format you can manage. The exact format depends on the workflow; the principle is the same either way. The vhs-decode explainer covers what RF-based capture is and when it's the right path; the hardware article covers the device choices in 2026. For a conventional pipeline, a TBC-corrected lossless FFV1 capture into MKV is the master format the community has settled on.

Keep the master untouched. Back it up. Don't run it through AI tools, don't deinterlace it, don't denoise it, don't upscale it. The master is the thing you preserve.

Derive a viewing copy from the master. This is the file you actually watch, share, and upload. Here AI tools are reasonable — for frame interpolation, for light denoise on a clean source, for dropout inpainting on small gaps if you're inclined to experiment, for a conservative upscale if you intend to view on a large screen. The viewing copy is by definition disposable. You can re-derive it from the master any time you change your mind.

If you find a tool that genuinely helps, save the AI-enhanced output alongside the master, never as a replacement for it. The question “can I run my tapes through Topaz” has a sensible answer: yes, on a copy. The question “is this my archive now” has a firmer one: no, the unprocessed master is.

The reason the two-master rule is the whole answer, rather than just one option among several, comes down to what these tools can and can't do. AI upscaling is invention. It adds things — pixels, frames, edges, textures — that weren't on the tape. Invented content can be plausible, attractive, sometimes preferable as a viewing experience. It is not a recovery of what was recorded. The line between processing (subtracting things that obscured the signal) and invention (adding things that weren't in the signal) is the line between work that's reasonable on a preservation master and work that belongs only in derivatives. Hold to that, and the rest of the question more or less takes care of itself.

What's next

How vhs-decode actually works is the layman explainer for the RF-capture pipeline that produces the cleanest possible archive master. Capture hardware in 2026 is the device-selection companion piece for anyone weighing the hardware side of a clean capture path.

This topic is also published as an article here »
