End-of-line (newline) characters for files

Thus spake viewofheaven:

“uckelman” wrote:

Can you give me an example of a file with \r\n for eol? When I search
the trunk, I don’t find any.

The file for which I just submitted a suggested patch:
EnumeratedPropertyPrompt.java

I checked this file from trunk@7961 in a hex editor and found no 0x0D
anywhere. Where exactly is the \r you’re seeing?

My favorite Emacs handles both. I just don’t wanna commit into svn
anything other than Vassal Engine’s standard end-of-line characters.

I periodically run a Perl script over the source to convert all tabs to
two spaces; it’s equally easy to remove all \r. I would say that this
is not much of an issue, as the only two possibilities, \r\n and \n, are
readable by everything that everyone uses.
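
For illustration only (this is not the Perl script mentioned above): a minimal Python sketch of the same kind of cleanup pass, assuming every tab becomes two spaces and every \r is simply dropped. The function name `normalize_source` is hypothetical.

```python
def normalize_source(text, indent="  "):
    """Return text with every carriage return removed and every tab
    replaced by `indent` (two spaces by default)."""
    # Dropping \r turns CRLF line endings into plain LF.
    text = text.replace("\r", "")
    # Replace each tab character with the chosen run of spaces.
    return text.replace("\t", indent)
```

When applying this to real files, they would need to be opened with `newline=""` (or in binary mode) so that Python does not translate the line endings before the function sees them.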


J.

Joel,

I appreciate your goal in converting tabs to 2 spaces, and maybe you have a significantly better monitor than I do, but my eye simply cannot scan block indentation at only two spaces. So I regularly convert two spaces to tabs in all code I am working with (which I have Eclipse display as 3 spaces, which I see immeasurably better). Would it be possible to have your script skip portions of the source tree?

I second this:

I recently stumbled on some HTML files in the tree that must have had only LF after downloading, as Notepad (on Windows) could not wrap the source. However, Eclipse had no problem presenting the source properly.

Thus spake pgeerkens:

I recently stumbled on some HTML files in the tree that must have had
only LF after downloading, as Notepad (on Windows) could not wrap the
source. However, Eclipse had no problem presenting the source
properly.

Notepad is not a member of the set of reasonable text editors. :slight_smile:

As an aside: I’ve not touched the line endings in the HTML, or if I
have, it was long enough ago that I don’t remember doing it.


J.

Thus spake pgeerkens:

Joel,

I appreciate your goal in converting tabs to 2 spaces, and maybe you
have a significantly better monitor than I do, but my eye simply cannot
scan block indentation at only two spaces. So I regularly convert two
spaces to tabs in all code I am working with (which I have Eclipse
display as 3 spaces, which I see immeasurably better). Would it be
possible to have your script skip portions of the source tree?

I know people have different preferences on this. What I think is
clearly wrong is a mixture of spaces and tabs. What I dislike about tabs
generally is that there’s no guarantee that anything lines up the way
you intended it to when I look at it in my text editor.

I don’t take the position that tabists and non-2-space indenters are heretics;
I’m happy with people using tabs and other indentation in the privacy of
their own homes, so long as I don’t have to see them.

This makes me think we should find a code formatter, so we can both
automatically have what we want. Do you know of any?


J.

Eclipse works nicely, as I can tell it to “present” tabs as any number of spaces desired. I use them so that you can tell your editor to “present” them as two spaces, while I tell Eclipse to “present” them as three spaces, and we are both happy. :slight_smile:

I will try harder to remember to use tabs only for block indentation, and not internal to a line.

Agreed that Notepad is suitable only for very quick and very dirty work, given other choices, but it is occasionally useful. I once was a vi pro, but that was almost before you were born. :smiley: My fingers still understand h-j-k-l though.

I too agree. Eclipse works nicely.

In the SVN repos, it is imperative that we maintain a standard set of codes for end-of-line and whitespaces. A mixture of discrepant codes for such characters will make maintenance and many operations error-prone, or at least difficult.

It will be fine if you set Eclipse to “convert tabs to spaces”, then you can tab all you want! That way, you can have all the spacing you want (eg 4 spaces) within any line, between any code elements, which makes for easier viewing.

I agree that code can be difficult to read, what with all the elements cramped onto a single line. In the SVN repos, your extra spaces within lines will show up, but it only makes for easier reading for everybody else (albeit rather non-standard code formatting).

That said, it is a good practice to separate code elements into separate lines, to make for easier reading. Especially separate arguments that are themselves complex function calls.

retval = SomeFunction(arg1, arg2,
  SomeComplexFunction(argA, argB),
  MakeYouDizzy(),
  arg5);

Every single line! I used XVI32 (hex editor) to check it too, and found the same thing that my Emacs showed me.

Is anyone else seeing this? Is my TortoiseSVN doing a number on me? My checkout of trunk shows a good 50% of files with CRLF and the other 50% with LF. No mixture of CRLF and LF within any single file, though.
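
A rough way to make the "which files have CRLF" question reproducible across machines is to count the endings directly from the raw bytes. This is only a sketch in Python; the function name `classify_line_endings` and the category labels are made up for illustration.

```python
def classify_line_endings(data):
    """Classify raw file bytes as 'CRLF', 'LF', 'CR', 'mixed' or 'none'."""
    crlf = data.count(b"\r\n")
    # Bare LF and bare CR are whatever is left once CRLF pairs are excluded.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    kinds = [name for name, n in (("CRLF", crlf), ("LF", lf), ("CR", cr)) if n]
    if not kinds:
        return "none"
    if len(kinds) == 1:
        return kinds[0]
    return "mixed"
```

Reading each file in binary mode (for example with `pathlib.Path.read_bytes()`) and passing the result to this function avoids any newline translation by the tool doing the checking, which matters here because the whole question is whether the client is converting on checkout.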

This is a vital clue! On Windows 7 TortoiseSVN, a checkout does not seem to auto-convert all line endings to anything else. On Windows 7 Cygwin SVN, there does seem to be an auto-conversion of line endings to LF. So that could explain why Joel isn’t seeing the CRLF on checkout?

Oh. So you’ve been doing housekeeping for tabs? This doesn’t sound right. :frowning: Shouldn’t we put up a “code format” policy?

Yeah, we can do housekeeping for CRLF too. But we shouldn’t have to.

We should put up comprehensive code formatting policies, and HowTo(s) (for various IDEs) to adhere to such policies. That should make it easy for every developer to adhere to the code formatting policy.

Well, ok, I’ll just do a 1-second replace to convert all CRLF to LF for whatever files I’m about to commit. But I won’t do a branch-wide conversion, because I don’t wanna be presumptuous.

It would seem that, by default, if eol-style is not set, SVN will commit whatever line endings we throw at it. Hence the invention of auto-props and eol-style in SVN. See the SVN book on eol-style, particularly this segment: “When this property is set to a valid value, Subversion uses it to determine what special processing to perform on the file so that the file’s line ending style isn’t flip-flopping with every commit that comes from a different operating system.”

If eol-style was never set on the files in Vassal Engine’s SVN, then the repository may well now be populated with a mixture of CRLF and LF.
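
For reference, a sketch of what the client-side setup could look like in a Subversion runtime `config` file. Which file patterns the project would actually want, and whether `native` or `LF` is the right value, are assumptions here and not project policy.

```ini
[miscellany]
enable-auto-props = yes

[auto-props]
# Assumed patterns; adjust to whatever the project decides on.
*.java = svn:eol-style=native
*.html = svn:eol-style=native
```

Auto-props only take effect when files are added or imported, so files already in the repository would still need the property set explicitly, e.g. with `svn propset svn:eol-style native` on each file.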

Last attempt to ask for sanity, before I commit my changes. Is anyone else seeing the same thing? Do note that different effects occur when checking out in different environments.

  • Windows 7, TortoiseSVN: 50% of files with CRLF and 50% with LF.
  • Windows 7, Cygwin SVN: 100% of files with LF.

Wait! This is not good for the SVN repos. If you forget to restore all the 2-space that you converted, they will be committed into SVN as “something other than it originally should be” (2 tabs in your case?).

The right way to do this is to set your Windows’ font size. If you’re using Windows 7, just open a “Windows Explorer” window, and enter for its address this: “Control Panel\Appearance and Personalization\Display” (without quotes). You can then get Windows to display fonts larger (say 150%).

There’s a reason for 2-space code format policy. Traditionally, it is to produce more compact code pages; having a printout showing wide tabs/indents will mean more paper to carry around for a given amount of code. But that is moot now, since we are “paperless”. We need to respect the project owner’s preference, given that it is reasonable, when it comes to code formatting. Personally, I think 4-space indents is better for most people (my eyes are keen, so I don’t care).

There’s also another code format policy mandating “max width of lines”. Usually, this is 79 characters thereabouts. Again, it makes for easy printouts. But more importantly, it makes for easy reading even when on screen. Note how newspapers print columns narrow, so our eyes don’t have to scan a mile across! Now, this I feel strongly about.

So, in short:

  • Spaces for tabs (MUST)
  • 2-space vs 4-space indents (OPTIONAL)
  • 79-character max line width (MUST)
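
The two MUST items above are mechanical enough to check automatically. A minimal sketch in Python, assuming a 79-character limit and that any tab anywhere on a line counts as a violation; `style_violations` is a hypothetical name, not an existing tool.

```python
def style_violations(text, max_width=79):
    """Return (line_number, problem) pairs for tabs and over-long lines."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if "\t" in line:
            problems.append((number, "tab character"))
        if len(line) > max_width:
            problems.append((number, "line too long"))
    return problems
```

The indent width is deliberately not checked, since that item is listed as OPTIONAL.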

All of those code formatting policies can be easily configured in Eclipse. When Joel gives me the green light, I’ll try to do up some HowTo(s) for that.

re:

There is no need to be insulting or rude. If you don’t brand me a heretic for 3-space indents, I won’t brand you one for insisting on ridiculously large margins and font sizes.

Actually, I believe the real reason for narrow newspaper columns is to disguise that they are written at a grade-5 reading level.

Yikes! I must have expressed myself wrong! I didn’t mean to be insulting at all!

But seriously, the right way to do it is to increase the Windows font size. I do that all the time with my smaller 21-inch monitors (am too used to 32-inch monitors when at home, good for eyes, won’t cause strain which leads to myopia, I believe).

I’m not sure if we’re on the same page regarding SVN (and other version-control) repository management. But there really are valid reasons for spaces (not tabs), as Joel had mentioned. These reasons are here to stay because almost all of us have learned our lessons, and learned them painfully in most cases. A straightforward way to understand that painful lesson is this: anything that can be interpreted in more ways than one isn’t written right. In short, such a thing is called “vague”. For example, a tab can be interpreted as 1 inch or 4 inches, depending on the editor.

Likewise, to further illustrate that “anything that can be interpreted in more ways than one isn’t written right”, and that also means “vague”… I’ll attempt to apologize for any possibility that my message was rude or insulting. As can be seen, my message was possibly vague enough that it is construed as rude by you, though construed as well-meaning by myself. (That is, I meant to make things easier for Joel, as well as assist you in adjusting to a tried-and-true practice). And that’s that! My previous message was a perfect example of a message that is just not right, since it can be interpreted in more ways than one!

If you’re thinking that my explaining how to use Windows is being condescending, you should know that I just recently learned Windows 7! No, I never went to Vista. I stuck with WinXP and avoided Vista like the plague. I thought you might be as unfamiliar with Windows 7 as I am! Many people avoided Vista like the plague. Plus, I never was much good with Windows.

About 4-space indents, it actually is a norm for Java code. I don’t know why. Maybe it has a reason tied to being Object-Oriented, not sure. I only know that most people I work with hate 2-space indents (except hardcore masochistic veteran coders, I think).

As for 3-space indents, I don’t know why 2-space and 4-space were the norms in the first place. Maybe 2-space was better than 1-space, and the new exercise to “increase the indent” simply doubled the 2-space (equals 4-space) indent?

Oh. So that’s why grammar errors are the norm there? Hmm. I always thought newspaper folks were rushed for time, and language mistakes were somewhat forgivable. Gosh, my grade-5 English must be bad. Even now, most words in newspapers are beyond me. (I’m not a native English-speaker, by the way).

Thus spake pgeerkens:

Agreed that Notepad is suitable only for very quick and very dirty work,
given other choices, but it is occasionally useful. I once was a vi
pro, but that was almost before you were born. :smiley: My fingers still
understand h-j-k-l though.

I use Vim more than any other single application. Why’d you stop using
vi?


J.

Thus spake viewofheaven:

“pgeerkens” wrote:

I will try harder to remember to use tabs only for block indentation,
and not internal to a line.

It will be fine if you set Eclipse to “convert tabs to spaces”, then you
can tab all you want! That way, you can have all the spacing you want
(eg 4 spaces) within any line, between any code elements, which makes
for easier viewing.

If I understand what “convert tabs to spaces” does, then this doesn’t
address Pieter’s point. He’d like to have 3-space indentation, but this
setting will have no effect if there are no tabs in our files.


J.

Thus spake viewofheaven:

There’s a reason for 2-space code format policy. Traditionally, it is to
produce more compact code pages; having a printout showing wide
tabs/indents will mean more paper to carry around for a given amount of
code. But that is moot now, since we are “paperless”. We need to respect
the project owner’s preference, given that it is reasonable, when it
comes to code formatting. Personally, I think 4-space indents is better
for most people (my eyes are keen, so I don’t care).

My reason for 2-space indent is that it keeps indentation from becoming
too deep too quickly. Two spaces is by far the most common indentation
policy in code I see these days.

There’s also another code format policy mandating “max width of lines”.
Usually, this is 79 characters thereabouts. Again, it makes for easy
printouts. But more importantly, it makes for easy reading even when on
screen. Note how newspapers print columns narrow, so our eyes don’t have
to scan a mile across! Now, this I feel strongly about.

So, in short:

  • Spaces for tabs (MUST)
  • 2-space vs 4-space indents (OPTIONAL)
  • 79-character max line width (MUST)

I’m pretty adamant about having 2-space indentation in the code as
stored in the repo. I think the solution to Pieter’s problem is a code
formatter.


J.

Thus spake viewofheaven:

“uckelman” wrote:

“viewofheaven” wrote:

The file for which I just submitted a suggested patch:
EnumeratedPropertyPrompt.java

I checked this file from trunk@7961 in a hex editor and found no 0x0D
anywhere. Where exactly is the \r you’re seeing?

Every single line! I used XVI32 (hex editor) to check it too, and found
the same thing that my Emacs showed me.

Is anyone else seeing this? Is my TortoiseSVN doing a number on me? My
checkout of trunk shows a good 50% of files with CRLF and the other 50%
with LF. No mixture of CRLF and LF within any single file, though.

I checked again just to be sure. I see no CRs there at all.

“pgeerkens” wrote:

I recently stumbled on some HTML files in the tree that must
have had only LF after downloading

This is a vital clue! On Windows 7 TortoiseSVN, a checkout does not seem
to auto-convert all line endings to anything else. On Windows 7 Cygwin
SVN, there does seem to be an auto-conversion of line endings to LF.
So that could explain why Joel isn’t seeing the CRLF on checkout?

I’m not using Cygwin. I’m not even using Subversion everywhere. I am for
3.1, but for the trunk, I’m using git as a Subversion client. I see no
CRs in either one.

We should put up comprehensive code formatting policies, and HowTo(s)
(for various IDEs) to adhere to such policies. That should make it easy
for every developer to adhere to the code formatting policy.

I intend to do this for V4.


J.

Nothing against vi - I would still use it in the right circumstance, and have once or twice installed it (and sed) to windows - but I stopped working for companies that coded for UNIX. Also, the functionality gap between vi and other code editors narrowed.

Well, how about Windows 7 with TortoiseSVN? Have you checked that? I really am sure there are CRLF in 50% of the files.

If it were all CRLF, I would think there’s been some auto-conversion upon check out. But there are CRLF in just 50% of the files.

Anyway, I’ll just commit my changes using LF for whatever files I’m committing.

Was just trying to give you a heads up regarding possible corruption of line endings in SVN repos.

Thus spake viewofheaven:

“uckelman” wrote:

I’m not using Cygwin. I’m not even using Subversion everywhere. I am
for 3.1, but for the trunk, I’m using git as a Subversion client. I
see no CRs in either one.

Well, how about Windows 7 with TortoiseSVN? Have you checked that? I
really am sure there are CRLF in 50% of the files.

I can’t, I don’t have access to Windows 7.

Anyway, I’ll just commit my changes using LF for whatever files I’m
committing.

Was just trying to give you a heads up regarding possible corruption of
line endings in SVN repos.

I appreciate the notice, but this should be ok, really, since every text
editor that anyone codes in can handle it.


J.

Actually, I was worried about inconsistent character codes (for line endings) in SVN repos. That could throw us some problems later on. At the least, it means a systemic irregularity that should be resolved, much like how “tabs must always be represented by spaces, or vice versa, all the time, choose one or the other”.

Another file that has CRLF: PieceMover.java in package “VASSAL.build.module.map”.

I’ll commit my changes in LF, as per the standard SVN repos practice.

I think astyle might be the code formatter we’re looking for.