This is my discussion blog. Whenever a page in this wiki has a discussion page created for it, that discussion page shows up below. Comments on my blog posts show up here too.

http://yarchive.net/comp/linux/utf8.html has a nice roundup of why filenames should be treated as a byte stream rather than data in any particular encoding.
Comment by David Friday evening, February 3rd, 2012
There are two characters to the left of GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS, but you only denote one.
Comment by Matthias Friday afternoon, February 3rd, 2012

Just take a look at the problems programs like git have when running on Windows. They have to deal with at least three different encodings: the user's locale, the console's output encoding, and UCS-2 for the filesystem. The latter may or may not be converted automatically when accessing the filesystem, depending on the API used.

Filenames only make sense as raw bytestreams. Even if everything spoke UTF-8, canonicalisation, combining characters, case (in)sensitivity and other issues would make this much harder, because the same sequence of "characters" (in the broadest sense) can be represented by different bytestreams.
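
To illustrate the point about different bytestreams, here is a minimal Python sketch. The filename is a made-up example, and `utf8_forms` is a hypothetical helper, not part of any tool mentioned here. It encodes the same visible name under two Unicode normalization forms:

```python
import unicodedata

def utf8_forms(name: str) -> dict:
    """Return the UTF-8 bytes of `name` under NFC and NFD normalization."""
    return {
        form: unicodedata.normalize(form, name).encode("utf-8")
        for form in ("NFC", "NFD")
    }

# Hypothetical filename containing a precomposed e-acute (U+00E9).
forms = utf8_forms("caf\u00e9.txt")
print(forms["NFC"])  # b'caf\xc3\xa9.txt'   (one precomposed code point)
print(forms["NFD"])  # b'cafe\xcc\x81.txt'  (base letter + combining accent)
```

Both names look identical when displayed, but a filesystem that compares raw bytes treats them as two different files.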

Comment by David Schmitt early Friday morning, February 3rd, 2012

I think "my program is special, all other programs can use the dumb interface" is the wrong conclusion.

Raw bytestream filenames are a fundamental property of Linux filesystems, and not something applications can form policy to get away from. Most applications that deal with files need to get filenames correct all of the time, not most of the time.
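
One way an application can get every filename right is to never lose the original bytes. The following is only a sketch in Python, using the standard `surrogateescape` error handler; the helper names and the sample filename are hypothetical:

```python
def to_text(raw_name: bytes) -> str:
    """Decode a raw filename for display or processing.

    Bytes that are not valid UTF-8 become lone surrogates instead of
    raising, so no information is lost.
    """
    return raw_name.decode("utf-8", errors="surrogateescape")

def to_bytes(text_name: str) -> bytes:
    """Turn the text form back into the exact original bytes."""
    return text_name.encode("utf-8", errors="surrogateescape")

# Hypothetical filename; the byte 0xff can never appear in valid UTF-8.
raw = b"report-\xff.txt"
assert to_bytes(to_text(raw)) == raw  # lossless round trip
```

The text form may not be printable as-is, but the name that goes back to the filesystem is byte-for-byte the one that came out of it.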

Comment by ulrik.sverdrup early Friday morning, February 3rd, 2012
I have a situation where I've got one file on an ext3 filesystem, otherwise encoded in UTF-8, whose name contains invalid characters. I try to ensure I use UTF-8 across all my filesystems, but these things can still crop up. I think this file was inside a torrent and the torrent software didn't do conversion. Or perhaps I copied it onto the system via Samba and that didn't.
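
A rough way to hunt for such files, sketched in Python. The function names are hypothetical, and the only assumption is that "invalid" means "does not decode as UTF-8":

```python
import os

def is_valid_utf8(raw_name: bytes) -> bool:
    """True if the raw filename bytes decode cleanly as UTF-8."""
    try:
        raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

def find_invalid_names(top: bytes = b".") -> list:
    """Walk `top` and collect paths whose last component is not valid UTF-8."""
    bad = []
    # A bytes argument makes os.walk yield bytes paths, untouched by decoding.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                bad.append(os.path.join(dirpath, name))
    return bad
```

Calling `find_invalid_names(b"/some/dir")` returns the offending paths as bytes, which can then be renamed by hand.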
Comment by Jon Dowland in the wee hours of Thursday night, February 3rd, 2012
I think that the root of the problem is that non-Unicode filenames are possible at all in the underlying OS API. They exist because this is how POSIX works. Until POSIX is eliminated (e.g. by making Microsoft Windows the only existing platform), you will have to do this regressive work.
Comment by Alexander late Thursday night, February 3rd, 2012
Huh, sure enough; looks like the wikipedia shortcut got broken in upstream ikiwiki in October 2011, and not fixed until a week ago, on January 25th.
Comment by Josh late Thursday evening, February 2nd, 2012

It seems that even with my own instance of ikiwiki (latest version, tracking sid/unstable), using the [[!wikipedia foo]] linking also gives me links to en.wikimedia.org, instead of to wikipedia.

I guess that this is what may be happening here, Josh.

Comment by rbrito late Thursday evening, February 2nd, 2012
Your link to Wikipedia seems broken; it links to en.wikimedia.org rather than en.wikipedia.org. You might consider using ikiwiki's shortcuts like [[!wikipedia instead, so that can't happen.
Comment by Josh Thursday evening, February 2nd, 2012
Hey Joey, thanks for sharing your information on using Google Voice on dial-up. I am not very computer savvy, so I was wondering if you could direct me to where I can get more information on setting up a shell account and creating the file that you mentioned. Thanks again for sharing the information on using Google Voice on dial-up.
Comment by Scott at lunch time on Thursday, February 2nd, 2012

The best version scheme I've ever found is the old-fashioned X.Y.Z ... BUT with these very strict rules:

  • X is incremented on any backwards-incompatible change compared to previous versions.
  • Y is incremented any time a new feature is added, but it doesn't break backwards compatibility if you weren't already using the feature.
  • Z is incremented on any change that doesn't add a new feature or break backwards compatibility: like a bug fix.
  • No "alpha", "beta", or "rc" cruft allowed. Just increment the version numbers correctly.

What I love about this is that it tells you things like:

  • 4.4.5 is just like your currently installed 4.4.2, but fixes bugs; you'd better install it.
  • 6.7.0 might have a nifty new feature compared to 6.6.3 that you have right now; what's more, it won't break anything.
  • 3.0.2 is the latest version upstream, but you have 2.19.6 ... ooh, better be careful; it might be way better software, but might not work with your data.
  • People who don't know how version numbers work can just pick the "highest" number, just like they do now.

Anyway, just wanted to share what I think is a good idea. I've used this scheme for over a decade for internal projects at work and it is a dream. =)
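
The X.Y.Z rules above can be sketched in a few lines of Python. This is only an illustration of the scheme as described; the function names and the wording of the returned labels are made up:

```python
def parse(version: str) -> tuple:
    """Split 'X.Y.Z' into a tuple of integers, so 2.19.6 sorts before 3.0.2."""
    x, y, z = (int(part) for part in version.split("."))
    return (x, y, z)

def describe_upgrade(installed: str, candidate: str) -> str:
    """Say what moving from `installed` to `candidate` implies under the rules."""
    old, new = parse(installed), parse(candidate)
    if new <= old:
        return "not newer"
    if new[0] != old[0]:
        return "breaking change"   # X was incremented
    if new[1] != old[1]:
        return "new feature"       # Y was incremented
    return "bug fix"               # only Z was incremented

print(describe_upgrade("4.4.2", "4.4.5"))   # bug fix
print(describe_upgrade("6.6.3", "6.7.0"))   # new feature
print(describe_upgrade("2.19.6", "3.0.2"))  # breaking change
```

Comparing integer tuples rather than strings matters: as plain text, "2.19.6" and "2.9.6" would sort the wrong way round.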

Comment by Wesley J. late Wednesday afternoon, February 1st, 2012

I might add the dots at some point. (It can only be done sanely when there's a new major version though.)

If you're manually memorizing or comparing debhelper version numbers, you're probably doing something wrong.

Comment by joey at teatime on Tuesday, January 17th, 2012

"Normal" non-date version numbers suck, because they don't mean anything except to the developer that came up with them. To everyone else they're an arbitrary number to compare to the other arbitrary version numbers of the same program (or package).

A date is at least something everyone can understand. It tells you the release date, but doesn't tell you anything about the maturity of the code. The irony is that this is fine, because version numbers without a date also tell you nothing about the maturity of the code. The only downside to version numbers with a date is that the version numbers are long, making them slightly harder to read. These kinds of date numbers seem like they should be easier for a computer to parse, though.

I found a pretty good web link discussing the issue here:

http://www.codinghorror.com/blog/2007/02/whats-in-a-version-number-anyway.html

-- ChrisK

Comment by Christopher Monday night, January 16th, 2012

I must admit, I was quite disappointed and frustrated when I saw that debhelper also got infected by these very annoying and hard-to-memorize version numbers.

With this version number scheme you can't see at first glance whether a version is the same as, higher than, or lower than the one you use. Try to compare 3.20110111 with 3.20101111 or 3.20111011 without starting to count 1s, just by looking at it. This is how ikiwiki and git-annex version numbers have looked to me for months now: they all look the same. I can't remember anymore which version I use on which box. It's just plain annoying.

I'm sure this argument isn't "bad enough" for Joey to stop this adversity, so I have a suggestion for damage mitigation: using delimiters between YYYY, MM and DD would already help a lot to guide the eye to where it has to look for each relevant token, e.g. it is easier to distinguish between 3+2011.01.11, 3+2010.11.11 and 3+2011.10.11. (I would prefer dashes as the delimiter inside the date, but that doesn't work for native Debian packages, which is what Joey uses for the mentioned software, so I used a plus as the main delimiter instead.)
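
For anyone who would rather let a script do the comparing, here is a small Python sketch. It assumes versions of the form MAJOR.YYYYMMDD, as in the examples in this comment; the function names are hypothetical:

```python
from datetime import datetime

def parse_date_version(version: str) -> tuple:
    """Split 'MAJOR.YYYYMMDD' into (major, date) for comparison."""
    major, stamp = version.split(".")
    return int(major), datetime.strptime(stamp, "%Y%m%d").date()

def with_delimiters(version: str) -> str:
    """Render the version in the 'MAJOR+YYYY.MM.DD' style suggested here."""
    major, day = parse_date_version(version)
    return f"{major}+{day:%Y.%m.%d}"

versions = ["3.20110111", "3.20101111", "3.20111011"]
print(sorted(versions, key=parse_date_version))
# ['3.20101111', '3.20110111', '3.20111011']
print([with_delimiters(v) for v in versions])
# ['3+2011.01.11', '3+2010.11.11', '3+2011.10.11']
```

This does not handle extra suffixes such as a trailing stable-release number; it only covers the plain two-part form.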

Nevertheless I still prefer non-date based versioning schemes over date based version numbers.

So far I have used date-based versions only for packaging snapshots that have not been knowingly tagged as a release. But since I was recently pointed to git's describe feature, I'll likely use that for snapshots in the future, especially automatically generated ones. While hex git ids are not useful when continuously ascending numbers are needed, version tags in git repositories often are continuously ascending. git describe counts the number of commits since the last tag and generates a version number out of the last tag, the number of commits and the beginning of the commit id, e.g. 0.9.4-92-gedb329a if the last tag was 0.9.4, there were 92 commits since then and the last commit is edb329a (the g stands for git).
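
Taking such a string apart again is straightforward. Below is a sketch in Python that assumes the TAG-COUNT-gCOMMIT layout described above; the regular expression and the function name are my own, not anything git ships:

```python
import re

# TAG, then the commit count, then 'g' followed by the abbreviated commit id.
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<commits>\d+)-g(?P<commit>[0-9a-f]+)$")

def parse_describe(text: str) -> tuple:
    """Split 'TAG-N-gHEX' into (tag, commits since the tag, commit id)."""
    match = DESCRIBE_RE.match(text)
    if match is None:
        raise ValueError(f"not in TAG-N-gHEX form: {text!r}")
    return match["tag"], int(match["commits"]), match["commit"]

print(parse_describe("0.9.4-92-gedb329a"))  # ('0.9.4', 92, 'edb329a')
```

When the current commit is itself tagged, git describe prints just the tag, so a real script would need to treat that as "zero commits since the tag" rather than as an error.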

Comment by XTaran Monday evening, January 16th, 2012
For stable I just tack on a number, so something like 3.20100815.7.
Comment by joey mid-morning Monday, January 16th, 2012
The first number would identify the branch?
Comment by Simon.Richter terribly early Monday morning, January 16th, 2012

Hi Joey,

Thanks for this, quite interesting really. What software are you using to talk to your inverter? I maintain a GPL package in Ubuntu called Aurora (authored by Curt Blank), and publish my data to pvoutput.org (where there seems to be quite a number of open source developers who publish their PV data).

Thanks! :-Dustin

Comment by Dustin at midnight, January 3rd, 2012
I am really impressed by your journalling. Mundane and banal subject matter in large quantities is what writing is all about, I think. Perhaps it might feel like you are not writing profound enough stuff, but the fact that you write is making you better at writing. And from there you can go on and write a sci fi novel or a nonfiction book about your neurotic little sister, or a combination of the two! Or just a simple poem about dripping rain on your roof, because I think much of your writing is poetic. Then it is quality over quantity. :) Good example.
Comment by Maggie mid-morning Monday, January 2nd, 2012

To put it in perspective, about the highest I've seen my panels produce is 125 watts. Not that many lightbulbs' worth.

I'm sometimes asked why I don't add more solar panels, and that's a good question; I have mostly enjoyed finding ways to make do with so little power, however.

Comment by joey Sunday evening, January 1st, 2012

My PV system is only 256 watts, and it's nearly 2 decades old (and the recycled batteries are older still). I wrote some more about it in getting to know my batteries.

It is neat to get a little power on cloudy days. Works as long as the clouds are not so thick that the day is entirely dim. My PV only produces enough on such days to "break even" with my running the laptop all day though.

I also like the effect where, on an overcast day with snowfall, the snow on the surrounding hills acts as a reflector and I get much more production than I normally would.

Comment by joey Sunday evening, January 1st, 2012