Fog Creek Software
Discussion Board




CVS : impossible to use ???

Hi

I'm sorry to make a new thread, but the other is lost in the archives...
the original thread : http://discuss.fogcreek.com/joelonsoftware/default.asp?cmd=show&ixPost=151844&ixReplies=26

I still do not understand why the threads are not sorted by last answer date?!

So my problem using CVS :

I am part of the development team of a big e-commerce site (imagine something like Amazon).
I want to use a version control system (I'm trying CVS), but the main problem is the size of the site:
2,000 code files (.cfm)
58,000 images
33,000 other binary files
All these files are spread across about 5,000 directories. It's 17 GB of data!

I tried to use TortoiseCVS (+ CVSNT) with as many exclusions as possible, so that only the .cfm files go into the repository, but:

1/ It's slow (Tortoise has to check all the directories to find these files): 10 minutes to update a local copy from the repository, 10 to 30 minutes to commit a whole structure.

2/ To work on "local copies" of the files I need another program to duplicate/synchronise the images and binaries. Such a program exists and works quite well (insync), but it means that after a sync I have code and images in the same directories, exactly as on the "real" web site. That's nice to work with, but when it's time to commit some changes, I can't commit a complete directory or even the whole checkout directory because of this mess. Having some versioned files and some unversioned files in the same directory seems like the beginning of big trouble.

I think it's nearly impossible to find a correct solution... maybe CVS and its friends are not usable on a big website?!

Help really needed

thanks in advance.

Olivier B
Wednesday, June 23, 2004

Why not have all the files in CVS?

Yes, a complete checkout is always going to take a long time, but after that you'll be checking in/out relatively small numbers of files at a time (or rather you should be).

Or, are you confusing CVS with a software configuration and packaging tool?

CVS is only good for managing source control (where source is whatever you deem it to be, it can be binary).

Simon Lucy
Wednesday, June 23, 2004

Try out subversion:

http://subversion.tigris.org/

From your comments, I think this is the most relevant feature offered by subversion as opposed to CVS:

"# Costs are proportional to change size, not data size

In general, the time required for a Subversion operation is proportional to the size of the changes resulting from that operation, not to the absolute size of the project in which the changes are taking place. This is a property of the Subversion repository model."

There is a TortoiseSVN available, so at the client end it is very similar to CVS.  Don't know about syncing, though.

http://tortoisesvn.tigris.org/

Ged Byrne
Wednesday, June 23, 2004

Subversion is significantly faster than CVS

Egor
Wednesday, June 23, 2004

Who would have guessed that a product named "Tortoise" is slow ?

Jackass
Wednesday, June 23, 2004

Tortoise isn't slow, CVS is.

The Tortoise clients have no GUI of their own.  You use them via the shortcut menus in Explorer.

This is why they are called Tortoise, because they are in the Shell.

Ged Byrne
Wednesday, June 23, 2004


Subversion is the key.  I've been using it at home for my independent project for a couple months.


Now I just need to figure out how to convert all our VSS stuff to SVN at work...

KC
Wednesday, June 23, 2004

OK, I'm trying to install Subversion (seems a bit harder than CVS), but I'm still a bit confused about managing so many files in a configuration manager...

Olivier B
Wednesday, June 23, 2004

You might want to rethink your project structure too:  You could break your thousands of files into separate "projects".  For example, one project includes all of the customer-facing sections, one project contains all of the admin sections, one contains common classes, etc.  Then you only need to update/commit the projects you're actually working on; the rest remains slightly out of date.

Then write some scripts with Python/bash/whatever to allow easy updates of combos, so that you can get the customer facing sections and common classes at the same time.  And one script to update everything at 3am right after you go home for the day :)
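A minimal sketch of such a script in Python, assuming three hypothetical projects (customer, admin, common) checked out side by side under one root directory. The project names, the combo names and the use of `cvs update -d -P` are illustrative assumptions, not the real site layout:

```python
import subprocess
from pathlib import Path

# Hypothetical combos: each name maps to the project directories it covers.
COMBOS = {
    "storefront": ["customer", "common"],
    "backoffice": ["admin", "common"],
    "everything": ["customer", "admin", "common"],
}

# -d creates new directories, -P prunes empty ones.
UPDATE_ARGV = ["cvs", "update", "-d", "-P"]


def build_update_commands(combo, root, combos=COMBOS):
    """Return a list of (argv, working_directory) pairs, one per project."""
    root = Path(root)
    return [(list(UPDATE_ARGV), root / project) for project in combos[combo]]


def update_combo(combo, root, dry_run=True):
    """Run (or just print, when dry_run is True) the update for each project."""
    for argv, cwd in build_update_commands(combo, root):
        if dry_run:
            print(f"(in {cwd}) {' '.join(argv)}")
        else:
            subprocess.run(argv, cwd=cwd, check=True)
```

By default `update_combo("storefront", "/work/site")` only prints what it would do; pass `dry_run=False` to run the commands. The 3am job would simply call the "everything" combo from a scheduler.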

Anonymous Coward
Wednesday, June 23, 2004

This is maybe not the time...

But haven't you ever considered putting all your images in an image folder, and keeping the source files separate?

Myling
Wednesday, June 23, 2004

Myling: the files are not all mixed together in the same directory, but the structure is very complicated.

For example:
"/" contains only code
"/images" only images
but "/examples" is a root folder for examples... made of HTML files, image subfolders, ...
"/product-dedicated-site" contains a complete "small website" dedicated to a particular product...

So after 80,000 files you have so many directories that you can't exclude or include anything manually.
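With that many directories, the include/exclude lists could be generated rather than written by hand. A rough Python sketch, assuming code can be recognised by the .cfm extension alone (an assumption; static HTML pages would have to be added to the set), that splits a tree into directories that hold code and directories that hold none:

```python
import os

# Assumption: only .cfm counts as code; extend this set as needed.
CODE_EXTENSIONS = {".cfm"}


def classify_directories(root, code_exts=CODE_EXTENSIONS):
    """Return (dirs_with_code, dirs_without_code), as paths relative to root."""
    with_code, without_code = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        # A directory "has code" if any file directly inside it has a code extension.
        has_code = any(
            os.path.splitext(name)[1].lower() in code_exts for name in filenames
        )
        target = with_code if has_code else without_code
        target.append(os.path.relpath(dirpath, root))
    return sorted(with_code), sorted(without_code)
```

The second list is the candidate set of directories to leave out of the repository entirely; the first is where code and binaries may still be mixed.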

Olivier B
Wednesday, June 23, 2004

I am struggling to understand how with dynamic page content, you could need that many unique pages.  Even with a site as large as Amazon, I doubt they would really need 2000 different pages if they were using dynamically generated content efficiently... or am I completely mistaken?
If it is the case that you need 2000 pages, I agree with the previous posts that large chunks of that are probably different projects, i.e. the store app, contact pages, etc.  I would at least start to break up that large codebase into 3-4 smaller pieces for the sake of upload times and clarity of concept.

Devin
Wednesday, June 23, 2004

I was surprised too: 2,000 files... in fact it grows pretty quickly.
For each page you have 2 or 3 modules or includes, the related back-office pages, ...
So you start with 50-100 pages, and 3 years later you have around 1,000 files.
We also have static pages, made by the team's graphic designer when he works on small sites dedicated to a product: 30 files per site.
Add some forum files, herbs and spices... and you reach 2k!

------------
Subversion under Win32 is really harder to install than CVS... it still doesn't work... the installation process doesn't make the connection with Apache 2.x...

Olivier B
Wednesday, June 23, 2004

Olivier,

"1/ it's slow (tortoise has to check all the directories to find these files) (10 minutes to update a local copy from the repository, 10 to 30 to commit a whole structure)."

That's insanely slow -- even for the number of files you have.  Are you running it on a network share?  The overhead of the network might be your problem.

"2/ but it means that after a synchro I have codes+images in the same directories ... [snip]... I can't commit a complete directory or even the whole checkout directory because of this mess."

Yes, your website structure does not work well for CVS.  In fact, your website structure seems to be a mess in general (code & images in the same directory?!?).  I believe you can configure CVS (Tortoise) to completely ignore some file types.
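For what it's worth, CVS reads ignore patterns from a `.cvsignore` file. A small Python sketch, assuming .cfm is the only extension to keep under version control (an assumption), that collects a `*.ext` pattern for every other extension found in a tree. The output is meant to be reviewed before pasting into a `.cvsignore`:

```python
import os


def ignore_patterns(root, keep_exts=(".cfm",)):
    """Collect a sorted list of '*.ext' patterns for every extension not kept."""
    patterns = set()
    for _dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into CVS bookkeeping directories.
        dirnames[:] = [d for d in dirnames if d != "CVS"]
        for name in filenames:
            ext = os.path.splitext(name)[1]
            # Files without an extension are left for manual review.
            if ext and ext.lower() not in keep_exts:
                patterns.add("*" + ext)
    return sorted(patterns)
```

Something like `print(" ".join(ignore_patterns("/work/site")))` would give a starting list; the path here is only a placeholder.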

"maybe CVS and it's firends are not usable on a big website !?!"

I use CVS for several big websites.  In fact, it would be impossible for me to manage all the big websites without it.  They all share a big common framework that is kept in sync with CVS.

Almost Anonymous
Wednesday, June 23, 2004

Finally I made it... Subversion is running via Apache.
I'm adding the files to the repository...

The Subversion server is a P3 600, 128 MB of RAM, Win XP, 20 GB of hard disk (yes, an old workstation...)

The repository is on the hard drive of the Subversion server.

The development files (for the first "Add") are on the development server (100 Mbps LAN between the 2 servers).

Olivier B
Wednesday, June 23, 2004

Installing Subversion on Win32 is HARDER than CVSNT, but it's not HARD. The directions are spelled out quite explicitly. It generally takes me about 10 minutes to get everything running.

Brad Wilson (dotnetguy.techieswithcats.com)
Wednesday, June 23, 2004
