End-of-line (newline) characters for files

A quick check seems to indicate that a good 50% of files have CRLF, and the other 50% have LF.

The file I just worked on (and haven’t dared commit yet) is CRLF.

Is it possible/desirable to do “enable-auto-props = yes” and “* = svn:eol-style=LF”? Or am I mistaken, and that auto-props only works on the SVN client side?

Bug 1926 is fixed, by the way. Somebody tell me what I should do with CRLF, and I’ll commit that patch I attached there.

Thus spake viewofheaven:

A quick check seems to indicate that a good 50% of files have CRLF, and
the other 50% have LF.

Can you give me an example of a file with \r\n for eol? When I search
the trunk, I don’t find any.

The file I just worked on (and haven’t dared commit yet) is CRLF.

Any sane text editor should be able to handle both.

Is it possible/desirable to do “enable-auto-props = yes” and “* =
svn:eol-style=LF”? Or am I mistaken, and that auto-props only works on
the SVN client side?

I believe this sets svn:eol-style on all files you commit, so it’s not
just client-side.


J.

I would think that if anything, one would want svn:eol-style=native

With that setting, the files in the repository are stored with LF and
on the client with whatever is native to that OS, namely LF on unix
and CRLF on Windows. The EOL translation is done on check-out and
commit and only affects the client files, so it is pretty transparent.
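
As a rough illustration of the translation described above, here is a minimal Python sketch. It is not Subversion’s actual implementation; the function names `to_repository` and `to_working_copy` are made up for this example, and it only models the LF/CRLF case.

```python
def to_repository(working_copy: bytes) -> bytes:
    """Normalize CRLF line endings to LF, the form kept in the repository."""
    return working_copy.replace(b"\r\n", b"\n")


def to_working_copy(repository: bytes, native_eol: bytes) -> bytes:
    """Translate LF line endings to the client's native line ending."""
    return repository.replace(b"\n", native_eol)


# A file edited on Windows is normalized on commit...
stored = to_repository(b"int a;\r\nint b;\r\n")

# ...and translated again on check-out, depending on the client OS.
on_windows = to_working_copy(stored, b"\r\n")
on_unix = to_working_copy(stored, b"\n")
```

The point of the sketch is that the repository only ever holds one form, so a commit from a different operating system does not rewrite every line.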

[Repost]

The file for which I just submitted a suggested patch: EnumeratedPropertyPrompt.java
In the same package, GlobalPropertiesContainer.java has LF. And so does the Makefile on root, if that gives any clue about the state of things.

My favorite Emacs handles both. I just don’t wanna commit into svn anything other than Vassal Engine’s standard end-of-line characters.

Oh. But “enable-auto-props” doesn’t work for “checking out”? I have “enable-auto-props”, and my checkout of trunk still gives me 50% CRLF and 50% LF.

I still don’t dare commit my changes to EnumeratedPropertyPrompt.java because it obviously (to me and my Windows 7) has CRLF end-of-lines.

That is absolutely great, and good practice. I only hope that Vassal Engine’s SVN repos isn’t already a mixture of CRLF and LF. And most importantly, I wanna know that I’m not committing any non-standard end-of-line characters.

Thus spake viewofheaven:

“tar” wrote:

I would think that if anything, one would want svn:eol-style=native

With that setting, the files in the repository are stored with LF and
on the client with whatever is native to that OS, namely LF on unix
and CRLF on Windows. The EOL translation is done on check-out and
commit and only affects the client files, so it is pretty transparent.

That is absolutely great, and good practice. I only hope that Vassal
Engine’s SVN repos isn’t already a mixture of CRLF and LF. And most
importantly, I wanna know that I’m not committing any non-standard
end-of-line characters.

I’m inclined to say that this is not good practice in general, though
in our particular case it’s harmless. The fact that SVN is capable of
checking in something other than a byte-for-byte copy of the file you
give it is a total misfeature, and could in some cases result in
corruption of your file (e.g., if a binary file is misidentified). git
handles this properly, by accepting what you check in.


J.
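
The corruption risk mentioned above can be shown with a small Python sketch. It assumes a text-style CRLF-to-LF translation is applied to bytes that are not text; the payload and the function name `translate_crlf_to_lf` are hypothetical, and whether Subversion would actually translate a given binary file depends on how its properties are set.

```python
def translate_crlf_to_lf(data: bytes) -> bytes:
    """Blindly rewrite CRLF pairs as LF, as a text EOL translation would."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical binary payload that happens to contain the bytes 0x0D 0x0A.
binary_payload = bytes([0x89, 0x50, 0x0D, 0x0A, 0x1A, 0x0A])

# The translation silently drops the 0x0D byte, so the data is one byte
# shorter and no longer matches what was handed to the tool.
translated = translate_crlf_to_lf(binary_payload)
```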

Thus spake viewofheaven:

“uckelman” wrote:

Can you give me an example of a file with \r\n for eol? When I search
the trunk, I don’t find any.

The file for which I just submitted a suggested patch:
EnumeratedPropertyPrompt.java

I checked this file from trunk@7961 in a hex editor and found no 0x0D
anywhere. Where exactly is the \r you’re seeing?

My favorite Emacs handles both. I just don’t wanna commit into svn
anything other than Vassal Engine’s standard end-of-line characters.

I periodically run a Perl script over the source to convert all tabs to
two spaces; it’s equally easy to remove all \r. I would say that this
is not much of an issue, as the only two possibilities, \r\n and \n, are
readable by everything that everyone uses.


J.
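
The cleanup described above might look roughly like this in Python. This is not the Perl script mentioned in the post, which is not shown in this thread; `normalize_whitespace` is a hypothetical name, and the sketch assumes every tab is replaced by a fixed two-space string rather than expanded to a tab stop.

```python
def normalize_whitespace(text: str, indent: str = "  ") -> str:
    """Remove every carriage return and replace each tab with `indent`."""
    return text.replace("\r", "").replace("\t", indent)


sample = "if (x) {\r\n\treturn;\r\n}\r\n"
cleaned = normalize_whitespace(sample)
```

When applying something like this to real files, the file should be read without newline translation (for example `open(path, newline="")`), otherwise Python may hide the `\r` characters before the function sees them.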

Joel,

I appreciate your goal in converting tabs to 2 spaces, and maybe you have a significantly better monitor than I do, but visually my eye simply cannot scan the block indent at only two spaces. So I regularly convert two spaces to tabs in all code I am working with (which I have Eclipse display as 3 spaces, which I see immeasurably better). Would it be possible to have your script skip portions of the source tree?

I second this:

I recently stumbled on some HTML files in the tree that must have had only LF after downloading, as Notepad (on Windows) could not wrap the source. However, Eclipse had no problem presenting the source properly.

Thus spake pgeerkens:

I recently stumbled on some HTML files in the tree that must have had
only LF after downloading, as Notepad (on Windows) could not wrap the
source. However, Eclipse had no problem presenting the source properly.

Notepad is not a member of the set of reasonable text editors. :)

As an aside: I’ve not touched the line endings in the HTML, or if I
have, it was long enough ago that I don’t remember doing it.


J.

Thus spake pgeerkens:

Joel,

I appreciate your goal in converting tabs to 2 spaces, and maybe you
have a significantly better monitor than I do, but visually my eye
simply cannot scan the block indent at only two spaces. So I regularly
convert two spaces to tabs in all code I am working with (which I have
Eclipse display as 3 spaces, which I see immeasurably better). Would it
be possible to have your script skip portions of the source tree?

I know people have different preferences on this. What I think is
clearly wrong is a mixture of spaces and tabs. What I dislike about tabs
generally is that there’s no guarantee that anything lines up the way
you intended it to when I look at it in my text editor.

I don’t take the position that tabists and non-2-space indenters are
heretics; I’m happy with people using tabs and other indentation in the
privacy of their own homes, so long as I don’t have to see them.

This makes me think we should find a code formatter, so we can both
automatically have what we want. Do you know of any?


J.
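
The alignment problem with tabs can be made concrete with a short Python sketch. The two sample lines and the helper `comment_column` are invented for illustration; the sketch assumes the author lined up the trailing comments in an editor that shows tabs as 8 columns.

```python
def comment_column(line: str, tabsize: int) -> int:
    """Return the display column where the trailing comment starts."""
    return line.expandtabs(tabsize).index("//")


# One line uses a tab before the comment, the other uses spaces.
with_tab = "x;\t// first"
with_spaces = "total;  // second"

# At a tab width of 8 both comments start in the same column.
aligned_at_8 = comment_column(with_tab, 8) == comment_column(with_spaces, 8)

# At a tab width of 4 the tabbed line shifts and the alignment is lost.
aligned_at_4 = comment_column(with_tab, 4) == comment_column(with_spaces, 4)
```

Tabs used only for leading block indentation do not have this problem, which is the distinction raised later in the thread.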

Eclipse works nicely, as I can tell it to “present” tabs as any number of spaces desired. I use them so that you can tell your editor to “present” them as two spaces, while I tell Eclipse to “present” them as three spaces, and we are both happy. :)

I will try harder to remember to use tabs only for block indentation, and not internal to a line.

Agreed that Notepad is suitable only for very quick and very dirty work, given other choices, but it is occasionally useful. I once was a vi pro, but that was almost before you were born. :D My fingers still understand h-j-k-l though.

I too agree. Eclipse works nicely.

In the SVN repos, it is imperative that we maintain a standard set of codes for end-of-line and whitespaces. A mixture of discrepant codes for such characters will make maintenance and many operations error-prone, or at least difficult.

It will be fine if you set Eclipse to “convert tabs to spaces”, then you can tab all you want! That way, you can have all the spacing you want (e.g., 4 spaces) within any line, between any code elements, which makes for easier viewing.

I agree that code can be difficult to read, what with all the elements cramped onto a single line. In the SVN repos, your extra spaces within lines will show up, but it only makes for easier reading for everybody else (albeit rather non-standard code formatting).

That said, it is a good practice to separate code elements into separate lines, to make for easier reading. Especially separate arguments that are themselves complex function calls.

retval = SomeFunction(arg1, arg2,
  SomeComplexFunction(argA, argB),
  MakeYouDizzy(),
  arg5);

Every single line! I used XVI32 (hex editor) to check it too, and found the same thing that my Emacs showed me.

Is anyone else seeing this? Is my TortoiseSVN doing a number on me? My checkout of trunk shows a good 50% of files with CRLF and the other 50% with LF. No mixture of CRLF and LF within any single file, though.
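
For anyone who wants to repeat this check on their own checkout, here is a minimal Python sketch that classifies files by line ending. The function names and the `*.java` default are assumptions for this example; it reads raw bytes, so no editor or client setting can hide a 0x0D.

```python
from collections import Counter
from pathlib import Path


def classify_eol(data: bytes) -> str:
    """Return 'CRLF', 'LF', 'mixed', or 'none' for a file's raw bytes."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf  # LF bytes not preceded by CR
    if crlf and lf_only:
        return "mixed"
    if crlf:
        return "CRLF"
    if lf_only:
        return "LF"
    return "none"


def tally_tree(root: str, pattern: str = "*.java") -> Counter:
    """Count files under `root` matching `pattern` by line-ending style."""
    return Counter(
        classify_eol(path.read_bytes())
        for path in Path(root).rglob(pattern)
        if path.is_file()
    )
```

Calling `tally_tree("trunk")`, where `trunk` stands for wherever the working copy lives, returns a count per style. Running it on the same revision checked out with different clients would show whether the client is changing the bytes.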

This is a vital clue! On Windows 7 TortoiseSVN, a checkout does not seem to auto-convert all line endings to anything else. On Windows 7 Cygwin SVN, there does seem to be an auto-conversion of line endings to LF. So that could explain why Joel isn’t seeing the CRLF on checkout?

Oh. So you’ve been doing housekeeping for tabs? This doesn’t sound right. :( Shouldn’t we put up a “code format” policy?

Yeah, we can do housekeeping for CRLF too. But we shouldn’t have to.

We should put up comprehensive code formatting policies, and HowTo(s) (for various IDEs) to adhere to such policies. That should make it easy for every developer to adhere to the code formatting policy.

Well, ok, I’ll just do a 1-second replace to convert all CRLF to LF for whatever files I’m about to commit. But I won’t do a branch-wide conversion, because I don’t wanna be presumptuous.

It would seem that, by default, if eol-style is not set, SVN will commit any line endings we throw at it. Hence the invention of auto-props and eol-style in SVN. See the SVN book on eol-style, particularly this segment: “When this property is set to a valid value, Subversion uses it to determine what special processing to perform on the file so that the file’s line ending style isn’t flip-flopping with every commit that comes from a different operating system.”

If eol-style was never set for all files in Vassal Engine’s SVN, this will mean that Vassal Engine’s SVN will now be populated with a mixture of CRLF and LF.

Last attempt to ask for sanity, before I commit my changes. Is anyone else seeing the same thing? Do note that different effects occur when checking out in different environments.

  • Windows 7, TortoiseSVN: 50% of files with CRLF and 50% with LF
  • Windows 7, Cygwin SVN: 100% of files with LF

Wait! This is not good for the SVN repos. If you forget to restore all the 2-space indents that you converted, they will be committed into SVN as “something other than it originally should be” (tabs in your case?).

The right way to do this is to set your Windows’ font size. If you’re using Windows 7, just open a “Windows Explorer” window, and enter for its address this: “Control Panel\Appearance and Personalization\Display” (without quotes). You can then get Windows to display fonts larger (say 150%).

There’s a reason for 2-space code format policy. Traditionally, it is to produce more compact code pages; having a printout showing wide tabs/indents will mean more paper to carry around for a given amount of code. But that is moot now, since we are “paperless”. We need to respect the project owner’s preference, given that it is reasonable, when it comes to code formatting. Personally, I think 4-space indents are better for most people (my eyes are keen, so I don’t care).

There’s also another code format policy mandating “max width of lines”. Usually, this is 79 characters or thereabouts. Again, it makes for easy printouts. But more importantly, it makes for easy reading even when on screen. Note how newspapers print columns narrow, so our eyes don’t have to scan a mile across! Now, this I feel strongly about.

So, in short:

  • Spaces for tabs (MUST)
  • 2-space vs 4-space indents (OPTIONAL)
  • 79-character max line width (MUST)

All of those code formatting policies can be easily configured in Eclipse. When Joel gives me the green light, I’ll try to do up some HowTo(s) for that.
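
Independent of any IDE, a line-width rule like the one proposed here is easy to check mechanically. A minimal Python sketch follows; the name `overlong_lines` and the choice to count a tab as 2 columns are assumptions for this example, not an agreed project policy.

```python
def overlong_lines(text: str, max_width: int = 79, tabsize: int = 2):
    """Return (line_number, width) pairs for lines wider than `max_width`."""
    result = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Expand tabs first so the width reflects what is displayed.
        width = len(line.expandtabs(tabsize))
        if width > max_width:
            result.append((number, width))
    return result


sample = "short\n" + "x" * 80 + "\n" + "y" * 79 + "\n"
report = overlong_lines(sample)
```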

re:

There is no need to be insulting or rude. If you don’t brand me a heretic for 3-space indents, I won’t brand you one for insisting on ridiculously large margins and font sizes.

Actually, I believe the real reason for narrow newspaper columns is to disguise that they are written at a grade-5 reading level.

Yikes! I must have expressed myself wrong! I didn’t mean to be insulting at all!

But seriously, the right way to do it is to increase the Windows font size. I do that all the time with my smaller 21-inch monitors (am too used to 32-inch monitors when at home, good for eyes, won’t cause strain which leads to myopia, I believe).

I’m not sure if we’re on the same page regarding SVN (and other version-control) repository management. But there really are valid reasons for spaces (not tabs), as Joel had mentioned. These reasons are here to stay because almost all of us have learned our lessons, and learned them painfully in most cases. A straightforward way to understand that painful lesson is this: anything that can be interpreted in more ways than one isn’t written right. In short, such a thing is called “vague”. For example, a tab can be interpreted as 1-inch or 4-inches, depending on the editor.

Likewise, to further illustrate that “anything that can be interpreted in more ways than one isn’t written right”, and that also means “vague”… I’ll attempt to apologize for any possibility that my message was rude or insulting. As can be seen, my message was possibly vague enough that it is construed as rude by you, though construed as well-meaning by myself. (That is, I meant to make things easier for Joel, as well as assist you in adjusting to a tried-and-true practice). And that’s that! My previous message was a perfect example of a message that is just not right, since it can be interpreted in more ways than one!

If you’re thinking that my explaining how to use Windows is being condescending, you should know that I just recently learned Windows 7! No, I never went to Vista. I stuck with WinXP and avoided Vista like the plague. I thought you might be as unfamiliar with Windows 7 as I am! Many people avoided Vista like the plague. Plus, I never was much good with Windows.

About 4-space indents, it actually is a norm for Java code. I don’t know why. Maybe it has a reason tied to being Object-Oriented, not sure. I only know that most people I work with hate 2-space indents (except hardcore masochistic veteran coders, I think).

As for 3-space indents, I don’t know why 2-space and 4-space were the norms in the first place. Maybe 2-space was better than 1-space, and the new exercise to “increase the indent” simply doubled the 2-space (equals 4-space) indent?

Oh. So that’s why grammar errors are the norm there? Hmm. I always thought newspaper folks were rushed for time, and language mistakes were somewhat forgivable. Gosh, my grade-5 English must be bad. Even now, most words in newspapers are beyond me. (I’m not a native English-speaker, by the way).

Thus spake pgeerkens:

Agreed that Notepad is suitable only for very quick and very dirty work,
given other choices, but it is occasionally useful. I once was a vi
pro, but that was almost before you were born. :D My fingers still
understand h-j-k-l though.

I use Vim more than any other single application. Why’d you stop using
vi?


J.

Thus spake viewofheaven:

“pgeerkens” wrote:

I will try harder to remember to use tabs only for block indentation,
and not internal to a line.

It will be fine if you set Eclipse to “convert tabs to spaces”, then you
can tab all you want! That way, you can have all the spacing you want
(e.g., 4 spaces) within any line, between any code elements, which makes
for easier viewing.

If I understand what “convert tabs to spaces” does, then this doesn’t
address Pieter’s point. He’d like to have 3-space indentation, but this
setting will have no effect if there are no tabs in our files.


J.

Thus spake viewofheaven:

There’s a reason for 2-space code format policy. Traditionally, it is to
produce more compact code pages; having a printout showing wide
tabs/indents will mean more paper to carry around for a given amount of
code. But that is moot now, since we are “paperless”. We need to respect
the project owner’s preference, given that it is reasonable, when it
comes to code formatting. Personally, I think 4-space indents are better
for most people (my eyes are keen, so I don’t care).

My reason for 2-space indent is that it keeps indentation from becoming
too deep too quickly. Two spaces is by far the most common indentation
policy in code I see these days.

There’s also another code format policy mandating “max width of lines”.
Usually, this is 79 characters or thereabouts. Again, it makes for easy
printouts. But more importantly, it makes for easy reading even when on
screen. Note how newspapers print columns narrow, so our eyes don’t have
to scan a mile across! Now, this I feel strongly about.

So, in short:

  • Spaces for tabs (MUST)
  • 2-space vs 4-space indents (OPTIONAL)
  • 79-character max line width (MUST)

I’m pretty adamant about having 2-space indentation in the code as
stored in the repo. I think the solution to Pieter’s problem is a code
formatter.


J.
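
To show what the formatter idea could mean in practice, here is a deliberately naive Python sketch that rescales leading-space indentation between the repository’s 2-space style and a local 3-space style. `reindent` is a hypothetical helper, not an existing tool; a real code formatter would parse the language, and this sketch does not handle continuation lines that are aligned by hand.

```python
def reindent(text: str, from_width: int, to_width: int) -> str:
    """Rescale leading-space indentation from one width per level to another."""
    out = []
    for line in text.split("\n"):
        stripped = line.lstrip(" ")
        leading = len(line) - len(stripped)
        # Whole indent levels are rescaled; leftover spaces are kept as-is.
        levels, remainder = divmod(leading, from_width)
        out.append(" " * (levels * to_width + remainder) + stripped)
    return "\n".join(out)


repo_style = "class A {\n  void f() {\n    g();\n  }\n}\n"

# View the file locally with 3-space indents...
local_style = reindent(repo_style, 2, 3)

# ...and convert back to the 2-space form before committing.
restored = reindent(local_style, 3, 2)
```

The design point is that the conversion runs in both directions, so the repository keeps one indentation style no matter what each developer prefers to look at.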