The Linux Letter: Subversion

I have been a fan of Concurrent Versions System (CVS) for many years and have used it to manage the changes not only to my software, but to my server configurations as well. No doubt about it, CVS has been an excellent software assistant for me (and innumerable others). But there's a new game in town. As CVS was created to enhance and address the shortcomings of its predecessor, Revision Control System (RCS), so too was Subversion created to enhance and address the shortcomings of CVS. This month, we'll look at the new-and-improved version control system that has become my favorite.

CVS Redux

As I wrote in an earlier article, CVS is a handy tool that allows you to keep track of all changes to your source code. Multiple users can be working in the same project--even in the same source file--and CVS will keep track of who changed what and when. Should there be a conflict caused by two developers working on the same section of a source file, CVS will call out the area in conflict and help you merge the changes successfully. While CVS is quite powerful, it has a few warts that can make its use somewhat trying.

Number one on my list of CVS warts is that, in CVS, it's a bear to rearrange your source file directory structure. While it's easy enough to physically move the directories on your checked-out version (using either a command line utility or drag-and-drop), telling CVS that you have done this requires a convoluted incantation consisting of multiple 'cvs add' and 'cvs remove' commands, which gets even worse if multiple developers have the same project checked out. Now, I'm sure that someone will point me to some drag-and-drop tool that invokes the appropriate deities to accomplish this seemingly simple task, but I've learned to give some thought to my source tree directory structure prior to starting a project. It just makes things easier.

Number two on my list of CVS warts is that CVS doesn't do binary files, such as jar files, word processing documents, or zip files. Let me rephrase that. CVS doesn't do binary files well. Should you have the need to store these in your CVS repository, you'll need to take care to let CVS know that the file is binary when you add it. Adding a file using the command form 'cvs add -kb' tells CVS that the file is binary and deserves special treatment, exempting it from attempts to expand embedded CVS keywords that may spuriously appear within the file. Once so tagged, CVS will simply copy the file whenever it is checked out or committed to the repository. Failure to add the tag can result in a corrupted file, so if you use CVS, keep the -kb switch in mind.

Other developers have their own personal favorites that could be added to my two items, and if you do any googling for "CVS shortcomings," "CVS design flaws," or "CVS annoyances," you'll get a good sampling of them.

A Better CVS

I have been aware of the Subversion project for a lot longer than I have been using it, having seen references to it at Web sites that I visit and in some articles I have read. Like most of you, I tend to stick with tools that have become "comfortable" to use, and to me, CVS fits into that category, shortcomings notwithstanding. Thus, I found no real reason to investigate Subversion--until my projects started to incorporate an increasing number of binary file types, such as Java jar files, OpenOffice.org documents, and various graphic files. CVS's handling of such files was becoming enough of an impediment to inspire me to visit the Subversion Project's Web site to learn more about it. A short description of the project appears on the site and states simply, "The goal of the Subversion project is to build a version control system that is a compelling replacement for CVS in the open source community. The software is released under an Apache/BSD-style open source license." My interest was piqued even further when I read the first entry in their features list: "Subversion is meant to be a better CVS, so it has most of CVS's features. Generally, Subversion's interface to a particular feature is similar to CVS's, except where there's a compelling reason to do otherwise." That entry alone gave me enough incentive to download and install Subversion on my laptop.

Once the software was loaded, I needed to spend only a few minutes reading the documentation provided on the Web site (click on the link for "Subversion Book") before I realized just how similar the tools were. For everyday work, there is very nearly a one-to-one mapping of CVS functionality to Subversion functionality. The only major difference between the two is that with CVS, everything is done with variations on one command, cvs. With Subversion, the suite is split into a few commands, based on whether the desired function is administrative (to operate on the repository itself) or client-oriented (to operate on projects within the repository). Thus, instead of issuing the command cvs init to initialize the repository, I issued the command svnadmin create.

Initializing a repository is a one-time operation. For the things I do daily, the similarities between the two packages are amazing. The CVS commands to import, check out, add, remove, and commit projects and files are cvs import, cvs co, cvs add, cvs remove, and cvs commit, respectively. Change "cvs" to "svn" in those commands and you have the Subversion command for the equivalent function, although some of the arguments differ. That meant that all I needed to do to make the switch to Subversion was to reprogram my memory to type "svn" instead of "cvs." It didn't take very long before my transition was complete.
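The renaming rule can be sketched in a few lines of Python. This is only an illustration and not part of either tool: the helper name to_svn is hypothetical, it covers just the subcommands named in this article, and it passes any arguments through untouched even though the two tools do not always accept the same ones.

```python
# Subcommands the article names as carrying over from CVS to Subversion.
SHARED_SUBCOMMANDS = {"import", "co", "add", "remove", "commit"}


def to_svn(cvs_command: str) -> str:
    """Translate a simple CVS command line into its Subversion counterpart.

    Hypothetical helper: global CVS options (such as -d) are not handled.
    """
    words = cvs_command.split()
    if len(words) < 2 or words[0] != "cvs":
        raise ValueError(f"not a CVS command: {cvs_command!r}")
    subcommand, args = words[1], words[2:]
    if subcommand == "init":
        # Repository administration lives in a separate program.
        return " ".join(["svnadmin", "create", *args])
    if subcommand in SHARED_SUBCOMMANDS:
        return " ".join(["svn", subcommand, *args])
    raise ValueError(f"no mapping known for subcommand: {subcommand}")


for command in ("cvs co myproject", "cvs commit", "cvs init"):
    print(f"{command:20} becomes  {to_svn(command)}")
```

The project name myproject is a placeholder. Anything outside the small set above raises an error rather than guessing at an equivalent.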

So Much Faster

If the only difference between the two packages was the command used, there wouldn't be much point to a transition from CVS to Subversion. Fortunately, that isn't the case.

The thing about CVS that I found most irritating was the incredible pain induced by a reorganization of a project's source code directory structure. It's something that can be done with CVS, but not easily. One of the commands that CVS lacks is Subversion's svn move command, which makes rearranging a project simple. Let's say I have a project that contains two directories: DirA and DirB. In directory DirA, I have a file, X, that I want to move to DirB. In Subversion, the command svn move DirA/X DirB does automatically what CVS requires you to do manually: it physically moves the file from DirA to DirB, schedules DirA/X for deletion, and schedules DirB/X for addition with the file's history carried along. The change reaches the repository with your next commit. This works on individual files, groups of files, or entire directories. Talk about a welcome improvement!
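To make the contrast concrete, here is a small Python sketch that lists the commands each tool needs for the DirA/X example. It only builds the command strings and runs nothing. The function name move_commands is hypothetical, and the CVS sequence is the manual move, remove, and add routine described earlier in this article.

```python
import posixpath
from typing import List


def move_commands(tool: str, source: str, dest_dir: str) -> List[str]:
    """Return the working-copy commands needed to move source into dest_dir."""
    target = posixpath.join(dest_dir, posixpath.basename(source))
    if tool == "svn":
        # One command moves the file and records both halves of the change.
        return [f"svn move {source} {dest_dir}"]
    if tool == "cvs":
        # The manual routine: move the file yourself, then tell CVS twice.
        return [
            f"mv {source} {target}",
            f"cvs remove {source}",
            f"cvs add {target}",
        ]
    raise ValueError(f"unknown tool: {tool}")


for tool in ("cvs", "svn"):
    print(tool, "->", move_commands(tool, "DirA/X", "DirB"))
```

In both cases a commit is still needed afterward to record the change in the repository.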

As to my issues with CVS's handling of binary file types, I can only say that with Subversion, they're all gone. I haven't had one single corruption of a binary file since I started using Subversion. The reason for this can be found in the Subversion FAQ:

How does Subversion handle binary files?

When you first add or import a file into Subversion, the file is examined to determine if it is a binary file. Currently, Subversion just looks at the first 1024 bytes of the file; if any of the bytes are zero, or if more than 15% are not ASCII printing characters, then Subversion calls the file binary. This heuristic might be improved in the future, however.

If Subversion determines that the file is binary, the file receives an svn:mime-type property set to "application/octet-stream". (You can always override this by using the auto-props feature or by setting the property manually with svn propset.)
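The heuristic described in the FAQ can be sketched in Python as follows. This is an approximation written from the FAQ text alone, not Subversion's actual code. Two details are assumptions: "ASCII printing characters" is taken to mean bytes 0x20 through 0x7E plus tab, line feed, form feed, and carriage return, and an empty file is treated as text. The function name guess_mime_type is hypothetical.

```python
from typing import Optional

SAMPLE_SIZE = 1024  # per the FAQ, only the first 1024 bytes are examined

# Assumption: what counts as an "ASCII printing character".
PRINTING = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0C, 0x0D}


def guess_mime_type(data: bytes) -> Optional[str]:
    """Return "application/octet-stream" for binary-looking data, else None."""
    sample = data[:SAMPLE_SIZE]
    if not sample:
        return None  # assumption: an empty file gets no svn:mime-type
    if 0 in sample:
        return "application/octet-stream"  # any zero byte means binary
    non_printing = sum(1 for byte in sample if byte not in PRINTING)
    # "More than 15%" is checked with integers to avoid rounding surprises.
    if non_printing * 100 > len(sample) * 15:
        return "application/octet-stream"
    return None


print(guess_mime_type(b"int main(void) { return 0; }\n"))
print(guess_mime_type(b"PK\x03\x04\x00\x00binary payload"))
```

A file that the sketch leaves without a type (None) is the case the FAQ lists first among files treated as text.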

Subversion treats the following files as text:

  • Files with no svn:mime-type
  • Files with a svn:mime-type starting "text/"
  • Files with a svn:mime-type equal to "image/x-xbitmap"
  • Files with a svn:mime-type equal to "image/x-xpixmap"

All other files are treated as binary, meaning that Subversion will:

  • Not attempt to automatically merge received changes with local changes during svn update or svn merge
  • Not show the differences as part of svn diff
  • Not show line-by-line attribution for svn blame
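The text-versus-binary rule in the two lists above boils down to a short predicate. The Python sketch below follows the FAQ wording. The function name treated_as_text is hypothetical, and None stands in for a file that has no svn:mime-type property.

```python
from typing import Optional

# Image formats the FAQ says are still handled as text.
TEXT_IMAGE_TYPES = {"image/x-xbitmap", "image/x-xpixmap"}


def treated_as_text(mime_type: Optional[str]) -> bool:
    """Apply the FAQ's rule: True means merge, diff, and blame are available."""
    if mime_type is None:  # no svn:mime-type property at all
        return True
    return mime_type.startswith("text/") or mime_type in TEXT_IMAGE_TYPES


for value in (None, "text/plain", "image/x-xpixmap", "application/octet-stream"):
    print(value, "->", "text" if treated_as_text(value) else "binary")
```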

In all other respects, Subversion treats binary files the same as text files, e.g. if you set the svn:keywords or svn:eol-style properties, Subversion will perform keyword substitution or newline conversion on binary files.

Note that whether or not a file is binary does not affect the amount of repository space used to store changes to that file, nor does it affect the amount of traffic between client and server. For storage and transmission purposes, Subversion uses a diffing method that works equally well on binary and text files; this is completely unrelated to the diffing method used by the 'svn diff' command.

Most CVS users have embedded CVS keywords (such as $Id$) in their source code so that CVS will expand them automatically upon a commit. In the source code, CVS expands keywords such as $Id$ into strings like $Id: cvs-notes.html,v 1.2 2001/02/08 05:16:06 joeuser Exp $, allowing you to document the version, date, time of commit, etc. This is CVS's default behavior, which gets turned off when you use the -kb switch. Subversion does the opposite and won't attempt to expand any keywords unless you explicitly enable expansion for a file by setting its svn:keywords property. So it's unlikely that a keyword in a binary file will randomly get expanded and thus corrupt the file.
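The difference between the two defaults can be illustrated with a toy expander in Python. This is a sketch of the opt-in idea only and not how either tool is implemented. The function name expand_keywords is hypothetical, the enabled set stands in for whatever a file has been tagged with, and the expanded value is simply the CVS example string from the paragraph above (Subversion's own $Id$ format differs).

```python
import re

# Matches $Name$ as well as an already expanded $Name: ... $ form.
KEYWORD = re.compile(r"\$(\w+)(?::[^$\n]*)?\$")


def expand_keywords(text: str, enabled: set, values: dict) -> str:
    """Expand only the keywords that were explicitly enabled for this file."""
    def substitute(match):
        name = match.group(1)
        if name in enabled and name in values:
            return f"${name}: {values[name]} $"
        return match.group(0)  # leave anything not enabled untouched

    return KEYWORD.sub(substitute, text)


values = {"Id": "cvs-notes.html,v 1.2 2001/02/08 05:16:06 joeuser Exp"}
source = "/* $Id$ */"

print(expand_keywords(source, {"Id"}, values))  # expansion switched on
print(expand_keywords(source, set(), values))   # nothing enabled: unchanged
```

With nothing enabled, a stray $Id$ sequence inside a jar or zip file passes through byte for byte, which is the behavior that protects binary files.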

Besides automatically identifying binary files, Subversion also uses a binary diffing algorithm to send changes when a commit is requested, so only the changed parts are sent. CVS, on the other hand, copies the entire file. While this functionality may be irrelevant if you're attached to a high-speed network, it's wonderful if you're using a dial-up connection. And even if you are on a high-speed network, are you patient enough to wait for those huge jar files to get transferred in their entirety every time they change? Subversion makes the whole process so much faster.

Client Options

Subversion repositories can be accessed in a variety of ways. Clients wishing to access repositories hosted on their own machine have it the easiest, as no additional configuration is required. Just create the repository using the command svnadmin create /path/to/repository (if you haven't already done so) and start using it by giving the svn command a URL of the form file:///path/to/repository. Clients wishing to access remote repositories have little more to do than to change the URL to point to the correct server, with the correct protocol.
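The only fiddly part of local access is getting the file:// URL right, so here is a small Python sketch that derives it from the repository path and lists the two commands involved. It builds the command lines without running them. The function name and the working-copy directory are hypothetical, and the sketch assumes a Unix-style absolute path.

```python
from typing import List
from urllib.parse import quote


def local_repository_commands(repo_path: str, working_copy: str) -> List[List[str]]:
    """Build the commands to create a local repository and check it out."""
    if not repo_path.startswith("/"):
        raise ValueError("repo_path must be an absolute Unix-style path")
    # An absolute path after "file://" yields the three slashes in file:///...
    url = "file://" + quote(repo_path)
    return [
        ["svnadmin", "create", repo_path],
        ["svn", "checkout", url, working_copy],
    ]


for command in local_repository_commands("/path/to/repository", "my-working-copy"):
    print(" ".join(command))
```

Each inner list is in the form that Python's subprocess.run accepts, should you want to execute the commands rather than print them.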

Setting up a remote server for access to Subversion repositories isn't difficult. The instructions to do that are included in the Subversion Book. The administrator can elect to provide access using Subversion's own server (either directly or over SSH) or by adding a module (WebDAV/SVN) to an Apache 2 Web server. Which method is most appropriate is dictated by your security requirements and the type of clients that you wish to support. Again, consult the book for further information.

Once you have a server configured, the only difference to the client is in the protocol specified in the URL. The client commands are all the same; only the URL changes. Local access uses the file:// URL, whereas remote access URLs are svn://, svn+ssh://, and http://. At my shop, I have set up Apache 2 to dole out access (so my URLs are http://) as this makes it easy to navigate the repositories using only a Web browser.
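Because the URL scheme is the only thing that changes, the access method can be read straight off the URL. The Python sketch below maps the four schemes mentioned above to the server setups described in this section. The host name svn.example.com and the function name access_method are placeholders.

```python
from urllib.parse import urlsplit

# Schemes named in the article and the setup each one implies.
ACCESS_METHODS = {
    "file": "local repository on this machine",
    "svn": "Subversion's own server",
    "svn+ssh": "Subversion's own server over SSH",
    "http": "Apache 2 with the WebDAV/SVN module",
}


def access_method(url: str) -> str:
    """Describe how a repository URL will be reached, based on its scheme."""
    scheme = urlsplit(url).scheme
    if scheme not in ACCESS_METHODS:
        raise ValueError(f"scheme not covered by this sketch: {scheme!r}")
    return ACCESS_METHODS[scheme]


for url in (
    "file:///path/to/repository",
    "svn://svn.example.com/project",
    "svn+ssh://svn.example.com/project",
    "http://svn.example.com/project",
):
    print(f"{url:40} {access_method(url)}")
```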

Tempered Enthusiasm

Having placed a considerable quantity of code under the management of CVS, I found my enthusiasm for a commitment to Subversion (pun intended) to be somewhat tempered. Even though I knew that, for me, a switch was inevitable, I had to decide what to do with the existing repository. One strategy that I briefly considered was to leave in CVS those projects that were already in CVS but to put new projects into Subversion. I discounted that idea because I wasn't interested in any additional maintenance headaches or confusion that would result from running multiple version control systems. Another idea that I entertained for even less time was the concept of leaving up the old CVS repository and then, when preparing to work on an existing project, exporting the project from CVS (using cvs export) and then importing the project into Subversion. This had all of the disadvantages of my first strategy while adding the loss of the project history and version tags. Yuck!

Fortunately, there is a wonderful tool called cvs2svn that will migrate your CVS repository to a Subversion repository. It worked fine for me, but I will give you this tip. The Web site says, "It [cvs2svn] is designed for one-time conversions, not for repeated synchronizations between CVS and Subversion." That's an understatement. If you decide to do a conversion, do yourself a big favor and do it into a freshly created repository. My first attempt caused the corruption of my current Subversion repository. Fortunately, I had made a backup prior to the conversion, so recovery was trivial. I'm sure that my enthusiasm for Subversion would have been tempered even more had I not taken the precaution of a backup.
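The "fresh repository" tip can be enforced with a small guard before the conversion is launched. The Python sketch below refuses to build the command if the target path already exists. The function name is hypothetical, and the command line assumes the --svnrepos option as documented for cvs2svn; check cvs2svn --help for the version you have installed before relying on it.

```python
import os
from typing import List


def cvs2svn_command(cvs_repo: str, new_svn_repo: str) -> List[str]:
    """Build a cvs2svn command line, insisting on a brand-new target path."""
    if os.path.exists(new_svn_repo):
        # Never convert into a repository that is already in use.
        raise FileExistsError(
            f"{new_svn_repo} already exists; choose a fresh repository path"
        )
    # Assumption: --svnrepos names the Subversion repository to be created.
    return ["cvs2svn", "--svnrepos", new_svn_repo, cvs_repo]
```

The guard is no substitute for the backup mentioned above, but it keeps a mistyped path from pointing the conversion at a live repository.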

Tool Integration

The longevity of CVS has resulted in its integration into many popular programming tools. As one example, we have Eclipse, the tool du jour for i5 programmers. Through the use of the Eclipse CVS plug-in, repository access is made quite simple. If you are a CVS/Eclipse user, you'll be happy to learn that there is a Subversion plug-in (called Subclipse) that can make your switch to that version control program seamless.

Given the rapid adoption rate of Subversion in the open-source world, I would bet that your favorite development tool supports Subversion right now. And if it doesn't, I'd bet that it won't be long before it does, assuming that it already supports CVS.

Runs on i5/OS

Your choice of server on which to host Subversion isn't limited to Linux. As listed on the site, you have a choice of hosting your repository on "all modern flavors of Unix, Win32, BeOS, OS/2, and MacOS X." Best of all, you can even host your Subversion repositories on i5/OS (V5R1 and above).

If you are using CVS, you ought to consider Subversion. If you're not using any version management tool, you ought to consider Subversion. The flexibility of Subversion makes it useful not only for programming projects (which is what most people use it for) but for any computer-related projects, such as presentations, documentation, or anything else you may store on a computer. Once you get into the habit of using a version control system, you'll wonder how you ever got along without it.

OhioLinux 2005

Last year, I wrote a column about the remarkable Ohio LinuxFest technology conference. The intrepid volunteers who made this happen are doing it again. On Saturday, October 1, 2005, the Ohio LinuxFest will be held in Columbus, Ohio. There is no registration fee to attend the event, and judging by the list of presenters, it should be even better this year than it was last. I hope to see you there!

 

Barry L. Kline is a consultant and has been developing software on various DEC and IBM midrange platforms for over 21 years. Barry discovered Linux back in the days when it was necessary to download diskette images and source code from the Internet. Since then, he has installed Linux on hundreds of machines, where it functions as servers and workstations in iSeries and Windows networks. He co-authored the book Understanding Linux Web Hosting with Don Denoncourt.
