Version Control Basics
A version control
system (or revision control system) is a system that tracks
incremental versions (or revisions) of files and, in some cases,
directories over time. Of course, merely tracking the various
versions of a user's (or group of users') files and directories isn't
very interesting in itself. What makes a version control system
useful is the fact that it allows you to explore the changes which
resulted in each of those versions and facilitates the arbitrary
recall of the same.
The Repository
At the core of the
version control system is a repository, which is the central store of
that system's data. The repository usually stores information in the
form of a file system tree—a hierarchy of files and directories.
Any number of clients connect to the repository, and then read from or
write to these files. By writing data, a client makes the information
available to others; by reading data, a client receives information
from others.
The Working Copy
A version control
system's value comes from the fact that it tracks versions of files
and directories, but the rest of the software universe doesn't
operate on “versions of files and directories”. Most software
programs understand how to operate only on a single version of a
specific type of file. So how does a version control user interact
with an abstract—and, often, remote—repository full of multiple
versions of various files in a concrete fashion? How does his or her
word processing software, presentation software, source code editor,
web design software, or some other program, all of which trade in
the currency of simple data files, get access to such files? The
answer is found in the version control construct known as a working
copy.
A working copy is, quite literally, a local copy of a particular
version of a user's VCS-managed data upon which that user is free to
work. Working copies appear to other software just as any other
local directory full of files, so those programs don't have to be
“version-control-aware” in order to read from and write to that
data. The task of managing the working copy and communicating changes
made to its contents to and from the repository falls squarely to the
version control system's client software.
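To make the idea concrete, here is a minimal sketch of a working copy: one version of a tree is written out as plain files, and an ordinary file operation then edits it with no knowledge of version control. The helper names `check_out` and `scan`, the file names, and the dictionary representation of a version are all assumptions made for illustration; they do not correspond to any real client's API.

```python
import tempfile
from pathlib import Path


def check_out(tree, destination):
    """Write one version of a tree (relative path -> text) as plain files."""
    root = Path(destination)
    for relative_path, text in tree.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
    return root


def scan(root):
    """Read the plain files back into a relative path -> text mapping."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): path.read_text(encoding="utf-8")
        for path in root.rglob("*")
        if path.is_file()
    }


# Hypothetical version of a small project, as stored by the repository.
version = {"README.txt": "hello\n", "src/main.txt": "first draft\n"}

with tempfile.TemporaryDirectory() as directory:
    working_copy = check_out(version, directory)

    # Any ordinary program can now edit the files; it sees only a directory.
    with open(working_copy / "src" / "main.txt", "a", encoding="utf-8") as handle:
        handle.write("second draft\n")

    # The client software compares the working copy against the checked-out
    # version to find out what needs to be sent back to the repository.
    edited = scan(working_copy)
    changed = sorted(path for path in edited if edited[path] != version.get(path))

print(changed)
```

The editing step uses nothing but the standard `open` call, which is the point: only the client-side helpers need to know that a repository exists.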
The problem of file sharing
All version
control systems have to solve the same fundamental problem: how will
the system allow users to share information, but prevent them from
accidentally stepping on each other's toes? It's all too easy for
users to accidentally overwrite each other's changes in the
repository. Consider the following scenario. Suppose we have
two coworkers, Harry and Sally. They each decide to edit the same
repository file at the same time. If Harry saves his changes to the
repository first, it's possible that (a few moments later) Sally
could accidentally overwrite them with her own new version of the
file. While Harry's version of the file won't be lost forever
(because the system remembers every change), any changes Harry made
won't be present in Sally's newer version of the file,
because she never saw Harry's changes to begin with. Harry's work is
still effectively lost—or at least missing from the latest version
of the file—and probably by accident. This is definitely a
situation we want to avoid!
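The scenario above can be reproduced in a few lines. This is only a sketch: `NaiveRepository` is a hypothetical in-memory class that accepts every write without checking whether the writer has seen the latest version, and the file contents are invented.

```python
class NaiveRepository:
    """Hypothetical repository with no protection against overwrites."""

    def __init__(self, content):
        self.history = [content]  # every saved version is remembered

    def read(self):
        return self.history[-1]

    def write(self, content):
        # No check that the writer started from the latest version.
        self.history.append(content)


repo = NaiveRepository("line one\nline two\n")

# Harry and Sally both read the same starting version.
harry_copy = repo.read()
sally_copy = repo.read()

# Each edits a private copy.
harry_copy = harry_copy.replace("line one", "line one (Harry's fix)")
sally_copy = sally_copy.replace("line two", "line two (Sally's fix)")

repo.write(harry_copy)  # Harry saves first
repo.write(sally_copy)  # Sally saves a few moments later

latest = repo.read()
harry_change_in_latest = "Harry's fix" in latest
harry_change_in_history = any("Harry's fix" in text for text in repo.history)

print(harry_change_in_latest)   # False: missing from the newest version
print(harry_change_in_history)  # True: still present in an older version
```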
The lock-modify-unlock solution
Many version control systems use a lock-modify-unlock model to address
the problem of many authors clobbering each other's work. In this
model, the repository allows only one person to change a file at a
time. This exclusivity policy is managed using locks. Harry must
“lock” a file before he can begin making changes to it. If Harry
has locked a file, Sally cannot also lock it, and therefore cannot
make any changes to that file. All she can do is read the file and
wait for Harry to finish his changes and release his lock. After
Harry unlocks the file, Sally can take her turn by locking and
editing the file. Figure 1.3, “The lock-modify-unlock solution”
demonstrates this simple solution.
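A rough sketch of the locking model follows. The class `LockingRepository`, the exception `LockError`, and the method names are hypothetical and chosen for illustration; real systems differ in how locks are requested and stored.

```python
class LockError(Exception):
    """Raised when a user tries to act on a file without holding its lock."""


class LockingRepository:
    """Hypothetical repository that allows one writer per file at a time."""

    def __init__(self, files):
        self.files = dict(files)
        self.locks = {}  # path -> name of the user holding the lock

    def lock(self, path, user):
        owner = self.locks.get(path)
        if owner is not None and owner != user:
            raise LockError(f"{path} is locked by {owner}")
        self.locks[path] = user

    def unlock(self, path, user):
        if self.locks.get(path) != user:
            raise LockError(f"{user} does not hold the lock on {path}")
        del self.locks[path]

    def read(self, path):
        # Reading never requires a lock.
        return self.files[path]

    def write(self, path, user, content):
        if self.locks.get(path) != user:
            raise LockError(f"{user} does not hold the lock on {path}")
        self.files[path] = content


repo = LockingRepository({"A": "original"})

repo.lock("A", "harry")
try:
    repo.lock("A", "sally")  # refused while Harry holds the lock
    sally_got_lock = True
except LockError:
    sally_got_lock = False

repo.write("A", "harry", "harry's edit")
repo.unlock("A", "harry")

# Only now can Sally take her turn, starting from Harry's result.
repo.lock("A", "sally")
repo.write("A", "sally", repo.read("A") + " + sally's edit")
repo.unlock("A", "sally")

print(sally_got_lock)
print(repo.read("A"))
```

Because Sally can only start after Harry has unlocked the file, her edit is necessarily based on his, so nothing is overwritten. The cost is that she has to wait.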
The copy-modify-merge solution
Subversion, CVS,
and many other version control systems use a copy-modify-merge model
as an alternative to locking. In this model, each user's client
contacts the project repository and creates a personal working copy.
Users then work simultaneously and independently, modifying their
private copies. Finally, the private copies are
merged together into a new, final version. The version control system
often assists with the merging, but ultimately, a human being is
responsible for making it happen correctly.
Here's an example.
Say that Harry and Sally each create working copies of the same
project, copied from the repository. They work
concurrently and
make changes to the same file A within their copies. Sally saves her
changes to the repository first. When Harry attempts to save
his changes later, the repository informs him that his file A is out
of date. In other words, file A in the repository has somehow
changed since he last copied it. So Harry asks his client to merge
any new changes from the repository into his working copy of file
A. Chances are that Sally's changes don't overlap with his own; once
he has both sets of changes integrated, he saves his working
copy back to the repository.
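The exchange between Harry, Sally, and the repository can be sketched as follows. Everything here is an assumption made for illustration: the `Repository` class, the `OutOfDateError` exception, and in particular `merge_lines`, which is a deliberately simple line-by-line merge that only works when edits keep the number of lines unchanged and do not touch the same line. Real clients use more capable merge algorithms.

```python
class OutOfDateError(Exception):
    """Raised when a commit is based on an older version than the latest."""


class Repository:
    """Hypothetical repository holding successive versions of file A."""

    def __init__(self, lines):
        self.versions = [list(lines)]

    @property
    def latest(self):
        return len(self.versions) - 1

    def checkout(self):
        # Return the version number together with a private copy of the lines.
        return self.latest, list(self.versions[self.latest])

    def commit(self, base, lines):
        if base != self.latest:
            raise OutOfDateError("file A has changed since it was copied")
        self.versions.append(list(lines))
        return self.latest


def merge_lines(base, mine, theirs):
    """Simplified merge: same line count, no line changed by both sides."""
    merged = []
    for base_line, my_line, their_line in zip(base, mine, theirs):
        if my_line == base_line:
            merged.append(their_line)  # only the other side may have changed it
        elif their_line == base_line or their_line == my_line:
            merged.append(my_line)     # only I changed it, or we agree
        else:
            raise ValueError("overlapping changes")
    return merged


repo = Repository(["alpha", "beta", "gamma"])
harry_base, harry = repo.checkout()
sally_base, sally = repo.checkout()

harry[0] = "ALPHA"  # Harry edits the first line
sally[2] = "GAMMA"  # Sally edits the last line

repo.commit(sally_base, sally)  # Sally saves first

try:
    repo.commit(harry_base, harry)
    rejected = False
except OutOfDateError:
    rejected = True  # Harry's copy is out of date

# Harry merges the latest repository changes into his copy, then commits.
new_base, latest_lines = repo.checkout()
harry = merge_lines(repo.versions[harry_base], harry, latest_lines)
repo.commit(new_base, harry)

final = repo.versions[repo.latest]
print(rejected)
print(final)
```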
But what if
Sally's changes do overlap with Harry's changes? What then? This
situation is called a conflict, and it's usually not
much of a problem.
When Harry asks his client to merge the latest repository changes
into his working copy, his copy of file A is somehow flagged as
being in a state of conflict: he'll be able to see both sets of
conflicting changes and manually choose between them. Note that
software can't automatically resolve conflicts; only humans are
capable of understanding and making the necessary intelligent
choices. Once Harry has manually resolved the overlapping changes
(perhaps after a discussion with Sally), he can safely save the
merged file back to the repository.
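As an illustration of what being flagged as a conflict might look like, the sketch below merges line by line and records every line that both users changed differently instead of guessing a winner. The function `merge_with_conflicts` and its return format are hypothetical, and the approach assumes edits never add or remove lines; it is not how any particular tool implements merging.

```python
def merge_with_conflicts(base, mine, theirs):
    """Merge line by line; report lines that both sides changed differently."""
    merged, conflicts = [], []
    for index, (base_line, my_line, their_line) in enumerate(zip(base, mine, theirs)):
        if my_line == their_line or their_line == base_line:
            merged.append(my_line)     # both agree, or only I changed it
        elif my_line == base_line:
            merged.append(their_line)  # only the other side changed it
        else:
            merged.append(None)        # left for a human to decide
            conflicts.append((index, my_line, their_line))
    return merged, conflicts


base = ["title", "body", "footer"]
harry = ["title", "body by Harry", "footer"]
sally = ["Title", "body by Sally", "footer"]

merged, conflicts = merge_with_conflicts(base, harry, sally)
print(merged)     # ['Title', None, 'footer']
print(conflicts)  # [(1, 'body by Harry', 'body by Sally')]

# Harry sees both versions of each conflicting line and chooses manually.
for index, my_line, their_line in conflicts:
    merged[index] = "body by Harry and Sally"

print(merged)     # ['Title', 'body by Harry and Sally', 'footer']
```

The code only detects the overlap and shows both sides. The resolution line is written by hand, which mirrors the point that the choice itself is left to a person.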
Revisions
Each time the
repository accepts a commit, this creates a new state of the
filesystem tree, called a revision. Each revision is assigned a unique
natural number, one greater than the number assigned to the previous
revision. The initial revision of a freshly created repository is
numbered 0 and consists of nothing but an empty root directory.
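The numbering rule can be sketched with a small model in which each revision is a snapshot of the whole tree. The class `RevisionedRepository`, the method names, and the use of a dictionary mapping paths to contents are assumptions made for illustration only.

```python
class RevisionedRepository:
    """Hypothetical repository that keeps one tree snapshot per revision."""

    def __init__(self):
        # Revision 0: nothing but an empty root directory.
        self.revisions = [{}]

    def commit(self, changes):
        """Apply path -> content changes and return the new revision number."""
        tree = dict(self.revisions[-1])  # start from the previous state
        tree.update(changes)
        self.revisions.append(tree)
        return len(self.revisions) - 1   # one greater than the previous number

    def tree_at(self, revision):
        return self.revisions[revision]


repo = RevisionedRepository()
first = repo.commit({"notes.txt": "draft"})
second = repo.commit({"notes.txt": "final", "todo.txt": "ship it"})

print(first, second)    # 1 2
print(repo.tree_at(0))  # {}
print(repo.tree_at(1))  # {'notes.txt': 'draft'}
```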