I received an email from another blogger, whose blog I had noticed via a trackback ping - Saheli, of Musings and Observations, had collected several responses to the NYTimes story in question and had some good questions (and, of course, musings and observations) about them. So, um, hi Saheli! I think I'm supposed to wax witty now.
Which is a perfect time to describe why this post sits in the 'Gear' category. It concerns something that's nagged at me for quite some time, though I've never been able to decide whether it is a) practical and b) necessary. Oh, and c) the Right Way To Go About It (quoth Pooh). As I was reading about the edited (read: deleted) scenes from the protest videos which were being used as evidence against the gentleman in the story, I was reminded of this thought which has kicked around in my head for a long, long time.
I keep wondering if there is a way to embed a pseudorandom stream of some sort into a frame-based media stream (like, say, video) which can be used to prove the integrity of the document at a later time. This is fraught with all manner of problems which I can think of right off the bat, and I'm not knowledgeable enough to prove that they can be fixed - plus, I have been convinced by people smarter than I about such matters that when doubt exists, be very very cautious. But still, the general idea seems like a good one to me - and since the basic premise is not one of exclusion or defense but of verification, I continue to pursue it.
Here are the essentials. In the story, a man arrested by the NYPD during the Republican National Convention was stated by the arresting officer under oath to have been acting in violent and generally non-police-approved ways. Videotapes were shown to the court which seemed, if not to prove, at least not to contradict the officer's story. However, at a later date, a researcher came across another videotape of the same events (made by an amateur cameraperson) which, on comparison, clearly showed that the tape presented to the court as an unmodified record of events had, in fact, been modified - and scenes had been deleted which showed the defendant acting in a non-threatening, calm manner as he claimed he had been. The charges were dropped hurriedly, and some excuse regarding 'the wrong scenes being cut by a technician' was offered. Without going into the responsibilities of the prosecution to ensure their case wasn't tainted by such oopses, or the need for investigation into the now-contradicted testimony under oath of the officer (Saheli visits this on her blog), I'd like to talk about that videotape.
I mentioned that having the second videotape around was a demonstration of one way of citizens keeping a check on overreach by government agents (be they police, federal agents, or simply overzealous traffic enforcers). The prevalence of active video cameras at the protests surrounding the RNC made the existence of a second video record of the events in question likely enough to be worth searching for. Without an actual surveillance society (and again, I'll leave the arguments about whether we have one already or not for another time, hello Mr. Orwell and Mr. Bentham) whose records would, in any case, be useless to the average citizen, the dispersal of recording gear among the citizenry and the habitual practice of using said gear whenever the machinery of government acts would seem like a valuable backstop for a civil rights society.
Man, I can be longwinded, can't I? All this to lead back to a technological fantasy. Okay. In any case, recent times have shown also that the problem with camcorders is that the output of a camcorder can be very easily faked - and even more easily and perhaps worse than faked, simply cut or edited to produce a markedly different outcome simply by rearranging the order of events on the record - or just dropping some of them. In the more difficult cases, inserting events or scenes into the record might be done to change the impact of what is shown.
How can the value of the 'third view' (first being eyewitness, second being official surveillance) be preserved? Enter the whole point of this post, only umpteen paragraphs down. Using a method blending, say, the elements of SMPTE, GPS, and MD5 it might be possible to create a 'verification track' on a standard video data stream. This verification track would be encoded as part of the normal video data, thus ensuring it would not require any special equipment or modification - in fact, it would ideally be part of the running video feed, with a visible component verifying its presence. The visible pattern would need to be some form of known progression, in an unobtrusive place (like a station ID bug currently used by many broadcasters) whose transformations could be observed over time during playback and compared to a known, reference signal. Any deviation would signal a break (hence edit) in the original master.
Of course, the verification graphic could simply be added to the final edited cut of the video. As a result, the actual graphic should serve to present a datastream which can be captured and decoded during playback, if necessary at low resolution through a video monitor. This datastream would contain something like the following:
This would serve, at later dates, as a validation of the videostream. The timecodes and GPS coordinates, in addition to being useful for continuity data, would allow viewers later to determine the camera position and event timing for evidential purposes (or plain curiosity). The pseudorandom number stream, which is the key to the whole thing, serves two primary purposes: integrity verification and security. The PRN is the output of a Pseudo-random Number Generator. Given a starting point (a 'seed number') a PRNG will produce a stream of what appear to be random numbers. However, if you give the same PRNG the same seed, it will always produce the same stream; furthermore, provided the PRNG is a cryptographically strong one (ordinary PRNGs make no such promise), unless you know the precise PRNG used and its seed state, no matter how much of its output you have in your hand you shouldn't be able to extrapolate what the next result will be.
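To make the 'same seed, same stream' property concrete, here is a minimal sketch in Python. It is an illustration, not a design: the function name prn_stream is made up for this post, and using HMAC-SHA256 over a frame counter as the generator is my assumption (a stand-in for whatever PRNG a real device would use, and not the MD5 I mentioned above).

```python
import hashlib
import hmac


def prn_stream(seed: bytes, count: int) -> list[str]:
    """Return `count` pseudorandom values derived from `seed`.

    Hypothetical helper: each value is HMAC-SHA256(seed, frame counter),
    truncated to 16 hex digits so it could fit in a small per-frame field.
    """
    values = []
    for frame_number in range(count):
        counter = frame_number.to_bytes(8, "big")
        digest = hmac.new(seed, counter, hashlib.sha256).hexdigest()
        values.append(digest[:16])
    return values


# The same seed always reproduces the same stream...
same = prn_stream(b"my seed", 5) == prn_stream(b"my seed", 5)
# ...while a different seed gives an unrelated one.
different = prn_stream(b"my seed", 5) != prn_stream(b"other seed", 5)
print(same, different)
```

A verifier holding the seed can regenerate the stream at playback time and compare it, value by value, against what it reads off the tape.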
What this means to us is that if we use this PRN stream to hash and encrypt the 'frames' of our verification data, it becomes very very hard for anyone to edit our video stream undetected. Assume that, during playback, the verification system is present. It is given the 'seed' password it was given during the filming of the video, and the video is set running. The verification system is busily comparing the contents of the verification frames as the video goes along; it's mostly concerned with the value of the PRN stream it's generating itself as compared to the one extracted from the video source. Suddenly, we hit a tape edit - and the stream is broken. A discrepancy shows up. The point is that it would be extremely difficult to manufacture the verification frames necessary to 'fill in the gaps' for a missing or inserted frame, because (presumably) the editor wouldn't have the password. Even if they did have the password, they would still have to find a way for their newly generated stream to hash properly with the now-changed PRN stream - unless they are exceedingly good at math, or their forgery is precisely the same number of frames long, they're going to have trouble making sure the PRN stream comes out right. They would have to modify every PRN frame for the remainder of the video. That's not in itself a real problem, but now every verification frame needs to be hashed again for video data checksumming (perhaps a color balance checksum, or some other means of 'fingerprinting' the frame) as well as having its timecode and GPS position data forged. The problem becomes much, much larger.
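Here is a rough sketch of that playback check, again in Python and again with heavy caveats. Everything in it is assumed for illustration: the names record and first_break, the use of a chained HMAC-SHA256 tag in place of 'hash and encrypt with the PRN stream', and a plain SHA-256 of the picture bytes as the frame 'fingerprint' (a real fingerprint would have to survive tape noise and re-encoding, which this does not).

```python
import hashlib
import hmac


def _tag(seed: bytes, prev_tag: str, timecode: str, gps: str, picture: bytes) -> str:
    """Tag one frame; it depends on the seed, this frame, and the previous tag."""
    fingerprint = hashlib.sha256(picture).hexdigest()
    message = "|".join([prev_tag, timecode, gps, fingerprint]).encode()
    return hmac.new(seed, message, hashlib.sha256).hexdigest()


def record(seed: bytes, shots):
    """Build a track of (timecode, gps, picture, tag) tuples while filming."""
    track, prev_tag = [], ""
    for timecode, gps, picture in shots:
        tag = _tag(seed, prev_tag, timecode, gps, picture)
        track.append((timecode, gps, picture, tag))
        prev_tag = tag
    return track


def first_break(seed: bytes, track):
    """Return the index of the first frame whose tag fails, or None."""
    prev_tag = ""
    for index, (timecode, gps, picture, tag) in enumerate(track):
        expected = _tag(seed, prev_tag, timecode, gps, picture)
        if not hmac.compare_digest(expected, tag):
            return index
        prev_tag = tag
    return None


seed = b"secure token of choice"
shots = [("00:00:00:%02d" % i, "40.7505,-73.9934", bytes([i]) * 4) for i in range(5)]
track = record(seed, shots)
cut = track[:2] + track[3:]      # an 'editor' drops the third frame
print(first_break(seed, track))  # None: the untouched track verifies
print(first_break(seed, cut))    # 2: the frame right after the cut fails
```

Because each tag folds in the one before it, a cut in the middle shows up at the first frame after the splice. Note one hole this sketch does not close: lopping frames off the very end leaves a shorter track that still verifies, so a real scheme would need some way to attest to the total length as well.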
This is not a solution. It is not even a proposal for one, really. Mr. Schneier and his colleagues are expert at pointing out holes in security plans like this, which is why I'm posting it, I suppose. I'm not so much interested in negative-proof-by-counterexample, because I can spin scenarios to beat the thing of varying likelihood. What I'm looking for is fundamental problems with the approach. I'm sure they're there. What are they? You tell me.
There are the obvious ones. Password, JB? snort. Password? Yeah, yeah, I know. But for 'password' insert 'secure token of choice.' No, none of them are perfect. But remember: the point is not to make One Perfectly Provable Video. We're hoping and assuming that there will be many cameras. We're trying to raise the bar for monkeying with video evidence high enough that simple editing out of scenes doesn't just 'go unnoticed until another tape is found.' There will always be a way for someone technologically savvy enough to beat any technological system of protection; I accept that maxim. The question is, how expensive is it for them to do so in terms of time and resources? Once you know that, compare that cost to your target. The target here is not the dedicated, determined forger with access to corporate and NSA-style computational resources. The target here is the casually overzealous prosecutor; the harried policeman who wants to cut corners; the angry rent-a-cop with a surveillance camera; the unscrupulous media consultant at a political protest. You're trying to make your multiple handcam records more believable in a world of lust and crime.
Any of this make sense?
How? Oh, hell, I dunno. Perhaps a small addon box with a GPS in it that plugs into the camcorder via Firewire. Maybe a feature on the camcorder itself. Perhaps the camcorder feed goes out the DV slot and onto a HD recorder which adds this track live - an opportunity for the wearable linux hackers. The idea's the thing, with which to catch the excesses of the king.
God, what an awful trampling of a quote.
Posted by jbz at April 17, 2005 2:00 AM