Avoiding the time_increment_bits problem when encoding MPEG-4 videos with bad headers to Ogg Theora

There has been some debate lately about the migration of YouTube to [[HTML5]], and whether they (i.e. YouTube’s owner, Google) should support [[H.264]] or [[Theora]] as standard codecs for the upcoming <video> tag. See, for example, how the FSF asks for support for Theora.

The thing is, I discovered [[x264]] not so long ago, and I thought it was a “free version” of H.264. I began using it to reencode the medium-to-low quality videos I keep (e.g., movies and series). The resulting quality/file size ratio stunned me. I could reencode most material downloaded from, e.g., p2p sources to 2/3 of its size, keeping the copy indistinguishable from the original to the naked eye.

However, after realizing that x264 is just a free implementation of the proprietary H.264 codec, and in the wake of the H.264/Theora debate, I decided to give Ogg Theora a go. I expected a fair competitor to H.264, although still noticeably behind in quality/size ratio. And that I found. I for one do not care if I need a 10% larger file to attain the same quality, if it means using free formats, so I decided to adopt Theora for everyday reencoding.

After three paragraphs of introduction, let’s get to the point: when reencoding some files with [[ffmpeg2theora]], I would get the following error:

% ffmpeg2theora -i example_video.avi -o output.ogg
[avi @ 0x22b7560]Something went wrong during header parsing, I will ignore it and try to continue anyway.
[NULL @ 0x22b87f0]hmm, seems the headers are not complete, trying to guess time_increment_bits
[NULL @ 0x22b87f0]my guess is 15 bits ;)
[NULL @ 0x22b87f0]looks like this file was encoded with (divx4/(old)xvid/opendivx) -> forcing low_delay flag
Input #0, avi, from 'example_video.avi':
  Metadata:
    Title           : example_video.avi
  Duration: 00:44:46.18, start: 0.000000, bitrate: 1093 kb/s
    Stream #0.0: Video: mpeg4, yuv420p, 624x464, 23.98 tbr, 23.98 tbn, 23.98 tbc
    Stream #0.1: Audio: mp3, 48000 Hz, 2 channels, s16, 32 kb/s
  .

[mpeg4 @ 0x22b87f0]hmm, seems the headers are not complete, trying to guess time_increment_bits
[mpeg4 @ 0x22b87f0]my guess is 16 bits ;)
[mpeg4 @ 0x22b87f0]hmm, seems the headers are not complete, trying to guess time_increment_bits
[mpeg4 @ 0x22b87f0]my guess is 16 bits ;)
[mpeg4 @ 0x22b87f0]looks like this file was encoded with (divx4/(old)xvid/opendivx) -> forcing low_delay flag
    Last message repeated 1 times
[mpeg4 @ 0x22b87f0]warning: first frame is no keyframe

I searched the web for solutions, but to no avail. Usually pasting literal errors into Google yields good results, but in this case I only found developer forums where this bug was discussed. What I did not find were simple instructions on how to avoid it in practice.

Well, here goes my simple solution: pass the file through [[MEncoder]] first. Where the following fails:

% ffmpeg2theora -i input.avi -o output.ogg

the following succeeds:

% mencoder input.avi -ovc copy -oac copy -o filtered.avi
% ffmpeg2theora -i filtered.avi -o output.ogg

I guess that what happens is basically that mencoder takes the “raw” video data in input.avi and makes a copy into filtered.avi (which ends up being exactly the same video), building sane headers in the process.
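
The two commands above can be wrapped in a small shell function, so that the intermediate copy is created and cleaned up automatically. This is only a minimal sketch: the function name remux_to_theora and the use of a scratch directory from mktemp are my own assumptions, and the mencoder and ffmpeg2theora invocations are simply the ones shown above.

```shell
# Hypothetical helper: the name and the temp-file layout are assumptions,
# not part of ffmpeg2theora or MEncoder.
# Usage: remux_to_theora input.avi output.ogg
remux_to_theora() {
    if [ "$#" -ne 2 ]; then
        echo "usage: remux_to_theora INPUT OUTPUT" >&2
        return 2
    fi
    input=$1
    output=$2

    # Scratch directory that holds the intermediate copy.
    tmpdir=$(mktemp -d) || return 1
    filtered="$tmpdir/filtered.avi"

    # Step 1: copy the video and audio streams untouched into a new AVI
    #         (same mencoder flags as above).
    # Step 2: encode the copy to Ogg Theora (same invocation as above).
    if mencoder "$input" -ovc copy -oac copy -o "$filtered" &&
       ffmpeg2theora -i "$filtered" -o "$output"; then
        status=0
    else
        status=1
    fi

    # Remove the intermediate copy whether or not the encode worked.
    rm -rf "$tmpdir"
    return "$status"
}
```

After sourcing the function, something like `remux_to_theora example_video.avi output.ogg` should do both steps in one go. Note that it always makes the intermediate copy, even for files whose headers are fine, which costs some disk space and time but keeps the logic trivial.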


First DreamHost disappointment

I will simply copy and paste an e-mail exchange between [[DreamHost]] and me, with a few extra comments (some data substituted by “xxxxx”):

DreamHost:

Dear Iñaki,

Our system has noticed what seems to be a large amount of “backup/non-web” content on your account (#xxxxx), mostly on user “xxxxx” on the web server “xxxxx”.

Some of that content specifically is in /home/xxxxx (although there may be more in other locations as well.)

Unfortunately, our terms of service (http://www.dreamhost.com/tos.html) state:

The customer agrees to make use of DreamHost Web Hosting servers primarily for the purpose of hosting a website, and associated email functions. Data uploaded must be primarily for this purpose; DreamHost Web Hosting servers are not intended as a data backup or archiving service. DreamHost Web Hosting reserves the right to negotiate additional charges with the Customer and/or the discontinuation of the backups/archives at their discretion.

At this point, we must ask you to do one of three things:

* You can delete all backup/non-web files on your account.

* You can close your account from our panel at:
https://panel.dreamhost.com/?tree=billing.accounts
(We are willing to refund to you any pre-paid amount you have remaining, even if you’re past the 97 days. Just reply to this email after closing your account from the panel).

OR!

* You may now enable your account for backup/non-web use!

If you’d like to enable your account to be used for non-web files, please visit the link below. You will be given the option to be charged $0.20 a month per GB of usage (the monthly average, with daily readings) across your whole account.

We don’t think there exists another online storage service that has anything near the same features, flexibility, and redundancy for less than this, so we sincerely hope you take us up on this offer!

In the future, we plan to allow the creation of a single “storage” user on your account which will have no web sites (or email). For now though, if you choose to enable your account for backups, nothing will change (apart from the charges). If you want to enable backup/non-web use on this account, please go here:

https://panel.dreamhost.com/backups.cgi?xxxxxxxxxxx

If you choose not to enable this, you must delete all your non-web files by 2008-07-16 or your account will be suspended.

If you have any questions about this or anything at all, please don’t hesitate to contact us by replying to this email.

Thank you very much for your understanding,
The Happy DreamHost Backup/Non-Web Use Team

My answer:

Dear DreamHost Support Team,

I fully understand your point. However, although the policy you cite from the TOS is apparently sensible, a detailed analysis shows that it makes little sense.

Right now I have a 5920 GB/month bandwidth limit, and a 540 GB disk quota in my account, both applied to web use. My current use in this regard is less than 4 GB disk space (0.7% of my quota), and my estimated bw use at the end of the present billing period will be around 0.2 GB (33 ppm (parts per million) of my current (and increasing) bw quota).

Now, on the other hand, I have some 50-100 GB of data (less than 20% of my disk quota!!) that I want to keep at the servers (for whatever private interest, that I do not need to disclose, but I will: backup and data sharing among my different PCs). Keeping this data up to date could cause between 1 MB and 1 GB worth of transfers per day (30 GB/month at most, or 0.5% of my bw quota).

All of the above raises some questions:

1) Why on Earth am I granted such a huge amount of resources that I will never conceivably use? Maybe just because of that: because I will never use them?

2) Why am I prevented from using my account in the only way that would allow me to take advantage of even a tiny part of those resources?

3) In what respect is the HD space and bw used up by a backup different from that used up by web content? Isn’t all data a collection of 0s and 1s? How can a Hosting Service, ISP, or any other provider of digital means DISCRIMINATE private data according to content?

4) Regarding the previous point, how is DH to tell if I simply move the backup dirs to the isilanes.org/ folder? I have to assume that if I make my backups visible through the web (which I can prevent with file permissions), then it makes them 100% kosher, since they become “web content” that I am allowed to host at DH?

It seems to me that you are renting me a truck to transport people, then frown at me if I take advantage of it to carry furniture. Moreover, you are advising me to keep the truck for people and rent small vans for the furniture.

[snip irrelevant part]

Believe me, I am willing to be a nice user. I just want to be able to use the resources I pay for the way I need.

Iñaki

Their answer:

Hello Iñaki,

1) Why on Earth am I granted such a huge amount of resources that I will never conceivably use? Maybe just because of that: because I will never use them?

Some people will. Admittedly, very few do, but to be perfectly blunt, overselling is actually a vital part of our (and ANY) web host’s business model:

http://blog.dreamhost.com/2006/05/18/the-truth-about-overselling/

2) Why am I prevented from using my account in the only way that would allow me to take advantage of even a tiny part of those resources?

That’s an exaggeration, to be honest. Anyone can use up to the entire amount of their bandwidth and space, providing they use it for the purpose intended. If we ever open DreamStorage, you’d be welcome to use that space for backing up your data.

3) In what respect is the HD space and bw used up by a backup different from that used up by web content? Isn’t all data a collection of 0s and 1s? How can a Hosting Service, ISP, or any other provider of digital means DISCRIMINATE private data according to content?

Well, just as we have…there’s a ton of data in a non-web-accessible directory. That’s a pretty good tip that something’s up. By your argument, we couldn’t take down someone for copyright, or even child porn violations, as it’s just “a collection of 0s and 1s”, and who are we to “discriminate”? Our Terms of Service, which you agreed to 2008-02-22 at 3:39pm. If you didn’t agree, this simply wasn’t the service for you.

4) Regarding the previous point, how is DH to tell if I simply move the backup dirs to the isilanes.org/ folder? I have to assume that if I make my backups visible through the web (which I can prevent with file permissions), then it makes them 100% kosher, since they become “web content” that I am allowed to host at DH?

Honestly, we’re not going to let you off on some weak technicality. If you don’t wish to comply with the ToS, we’ve even allowed you the option of receiving a prorated refund, regardless of how far out from your 97 day guarantee you are. We have no desire to lose your business, but your truck analogy is almost there. We’re offering you trucks for transporting furniture…and we’re doing it at a nice low rate. But we do require you actually use them. We count on the fact that very few people are going to be moving furniture 24/7, but if someone wanted to use it to its fullest, they could. However, that doesn’t mean you get to rent the truck, park it somewhere, and use it as a free self-storage unit. We want the truck if you’re not using it for its intended purpose.

[snip irrelevant part]

Let me know if you have any other questions.

Thanks!

Jeff H

My final answer:

Hi Jeff,

Thanks for the kind answer! This kind of support is what gives DH an edge over other hosting providers. Keep it up.

What I say in my second point is not an exaggeration. It’s the plain truth: if not for backups, I will never use 1% of my quota. I mean *I* won’t. Don’t know about others, just me.

It seems a little unfair that some guy with 500 GB of HD use and 5800 GB/month of bw use is paying $8/month as I am (I don’t recall the exact amount), while I am using 4 GB and 0.2 GB/month. Then I want to use 80 GB and 30 GB/month and I have to pay an extra $16. That’s a total of TRIPLE what the aforementioned guy pays, while I’m still using 6 times less HD and 200 times less bw.

I would love to pay for some resources, and administer them as I like, be it for web, backup, svn, or whatever. What I meant with my third point is that 100 MB of my backups “hurt” the system as much as somebody else’s 100 MB of web content, so I can’t see the reason to make the user pay a separate bill for “backups”. Just make ftp traffic count against the disk/bw quotas and that’s it! You could then stop worrying about “fair” use.

But that’s pointless ranting on my side. Thanks for the attention. I will consider what to do in the light of the information you provided me.

Iñaki

I just want to point out how ridiculous their answer to my third point above is. DH tells me that they should be able to discriminate my data according to content (or use), because the opposite would supposedly allow me to break the law with copyright violations or child pornography. To continue the truck metaphor, I am renting a truck from them to carry furniture around. Since I don’t use up all the space in the truck, and I have a fridge I want to move, I put it into the truck. Now DH wants to patrol what I carry in the truck, and tells me that the fridge is not allowed, because it is not “furniture”. When I complain, and say that what I carry in the truck they rent me is none of their business, they answer that it is, because I could well be using the truck for drug smuggling. That’s really lousy reasoning. If I use the truck for carrying something illegal, then the police will sort it out, not the rental company. It is the general law that will tell me what I can use the truck for, not the rental company.


Creative Labs and the proprietary idiocy

Just when you thought that the world of proprietary software and silly “intellectual property” business couldn’t get any worse, they surprise you!

This weekend I learned about a message from a big boss at Creative Labs to an individual nicknamed daniel_k at some [[Creative Labs]] forum. Please follow the link to the message, because it is very interesting. And don’t forget to read some of the responses.

The short story is that Creative Labs produces some sound cards and their drivers. Apparently some of the drivers would not work for Windows Vista, and daniel_k managed to program drivers for Vista (and Linux, I think), and distributed them for free (asking only for voluntary donations). The result: an open message in a forum, asking daniel_k to [[cease and desist]].

The rationale for CL to do that seems to be that they didn’t release Vista drivers for the sound cards on purpose, so that customers would have to buy new cards if they switched to Vista. With daniel_k’s contributions, such customers are not forced to dump the old card for a new one, so this costs CL money!

Another example of absolutely vile acts from vendors of proprietary software (were the drivers [[free software]], this discussion would be moot), and one more reason to say fuck you all!

The good part is that the story is already being spread around the net, and a lot of customers and potential customers are becoming angry customers and potential customers. I wish CL the worst for their vileness and shortsightedness on the issue. They should have supported daniel_k, and used the ensuing positive feedback campaign… but they didn’t. Shame on you, Creative Labs!


Me 0 – DreamHost 1

Yesterday evening I boldly decided to upgrade [[WordPress]] (the software this blog runs on) to version 2.5. [[DreamHost]], my hosting service, provides easy click-through installation and upgrades of software, so I used it for the upgrade.

Sadly, and probably because of some mistake I made, everything ended up screwed, and my blog experienced some problems, like not showing any posts at all! I proceeded to contact the support team, and the response was awesome: they answered incredibly fast, and the solution was concise and correct.

I have to say that DreamHost has surprised me very positively!
