#1 2012-05-08 21:31:14

jae
Member
Registered: 2012-05-08
Posts: 1

Git repository question

I maintain a number of independent web sites with local copies of the directory trees on the same PC. I am planning to use Git to place these sites under revision control.  As a new Git user, I would like some advice from those with more experience.

Most of the web sites have trees with a number of relatively short source files (html/css/php/perl/etc.) and two directories under the root which contain large binary files: /images and /documents.  The files in these two directories are modified much less frequently than the other site files.

(1) My thinking is to create a Git repository for each web site.  This seems to make the most sense as the updates to each site are completely independent of each other.  Are there any reasons I might not want to do so?

(2) My understanding is that each Git commit saves the entire content of the managed file set in the repository. Thus, if I were to have the entire web site file structure under revision control, each time I save changes to the source files, all the binary files would be saved as well: the result being many redundant copies of these files.

I could exclude the directories containing the binary files from the managed file set; however, I would like to manage the changes to the directories somehow - albeit in a more efficient manner.

I am considering creating 3 repositories for each web site: (1) / (excluding /images and /documents); (2) /images; (3) /documents.  While a bit clumsy, the changes to the latter are infrequent enough that it should be manageable.

Does anyone know of any problems that might result from such an organization, such as Git getting massively confused?

(3) Is there a better tool to use for revision control under these circumstances than Git?

Some background that may help in answering (3):

The primary reason for choosing Git was the general consensus (at least in the sources I checked) that managing branches with it is much easier than with the other options, such as SVN.  I am planning major changes to some of the sites, so managing branches relatively painlessly is a big win.  For now, all the repositories will only be accessed by one user from one system. In the future I may access them from several local systems, but this isn't a requirement.

For those familiar with DEC/HP CMS (for OpenVMS), it would do quite well for this environment as the files in a library are managed individually. But, alas, it runs on the wrong operating system.

Thanks in advance for your assistance.

#2 2012-05-08 22:00:51

fsckd
Forum Fellow
Registered: 2009-06-15
Posts: 4,173

Re: Git repository question

jae wrote:

(2) My understanding is that each Git commit saves the entire content of the managed file set in the repository. Thus, if I were to have the entire web site file structure under revision control, each time I save changes to the source files, all the binary files would be saved as well: the result being many redundant copies of these files.

Git keeps only one copy. Take a look at Pro Git which has a very nice description of the basics.
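
As a rough illustration of why only one copy is kept, here is a minimal Python sketch of how Git names a file's content (a "blob") in a repository using the default SHA-1 object format. The function name `git_blob_id` and the sample bytes are made up for this example; the point is only that identical content always gets the same ID, so an untouched image is not stored again by later commits.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git gives a blob with this content.

    Git hashes a small header ("blob <size>" plus a NUL byte)
    followed by the raw file bytes.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical bytes standing in for a large file under /images.
logo_at_commit_1 = b"\x89PNG pretend image bytes"
logo_at_commit_2 = logo_at_commit_1  # not modified between the two commits

# Same content, same ID: both commits refer to one stored object.
print(git_blob_id(logo_at_commit_1) == git_blob_id(logo_at_commit_2))  # True
```

The same ID can be checked against a real repository with `git hash-object <file>`.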



#3 2012-05-08 22:29:36

Xyne
Forum Fellow
Registered: 2008-08-03
Posts: 6,965

Re: Git repository question

1) One Git repo per site is the most logical approach. They are separate and so should their repos be. It will also make it easy to add or remove sites later.

2) My understanding is that Git does not duplicate unchanged files. If every file were copied with every commit, Git repos would be impossibly huge. Each commit records a snapshot of the whole tree, but a file whose content has not changed is stored once and simply referenced again, so committing a change to a source file adds only the new version of that file. Git can also delta-compress objects when it packs the repository. The repo therefore holds every version of every managed file (including any committed branches), without redundant copies of the ones that stayed the same.
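
To put a number on the storage cost, here is a toy model in Python of a content-addressed store. It is only a sketch: `ToyStore`, the file names, and the file contents are invented for illustration, and real Git adds trees, commit objects, compression, and packfiles on top. It shows two commits of a site where only one source file changes.

```python
import hashlib


class ToyStore:
    """Toy content-addressed store; a stand-in for Git's object database."""

    def __init__(self):
        self.blobs = {}    # object ID -> file content, stored once per ID
        self.commits = []  # each commit is a snapshot: {path: object ID}

    def commit(self, files):
        snapshot = {}
        for path, content in files.items():
            oid = hashlib.sha1(content).hexdigest()
            # Content that is already stored is not stored again.
            self.blobs.setdefault(oid, content)
            snapshot[path] = oid
        self.commits.append(snapshot)


store = ToyStore()
site = {
    "index.html": b"<h1>v1</h1>",
    "images/logo.png": b"large binary payload",
}
store.commit(site)

site["index.html"] = b"<h1>v2</h1>"  # edit a source file only
store.commit(site)

# Two full snapshots, but the image is held once: 3 blobs, not 4.
print(len(store.commits), len(store.blobs))  # 2 3
```

Both snapshots list `images/logo.png`, yet they point at the same stored object, so the second commit costs only the new `index.html`.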

Of course, if you "delete" a file, it's still there in the history. You can edit the history to completely remove it, but that may create problems if others have cloned the repo and want to merge anything later.

3) I don't know. If you want to use branching to manage major updates, Git seems like the right way to go.

Disclaimer: I still haven't really started using Git, so I may have no idea what I'm talking about.



#4 2012-05-08 23:15:26

fukawi2
Ex-Administratorino
From: .vic.au
Registered: 2007-09-28
Posts: 6,237

Re: Git repository question

jae wrote:

(1) My thinking is to create a Git repository for each web site.  This seems to make the most sense as the updates to each site are completely independent of each other.  Are there any reasons I might not want to do so?

One git repo per website makes sense.

jae wrote:

(2) My understanding is that each Git commit saves the entire content of the managed file set in the repository. Thus, if I were to have the entire web site file structure under revision control, each time I save changes to the source files, all the binary files would be saved as well: the result being many redundant copies of these files.

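git tracks content rather than re-saving everything: a file that has not changed is stored only once, however many commits include it, so it is very efficient here.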

jae wrote:

(3) Is there a better tool to use for revision control under these circumstances than Git?

You might like to read over this... It's still git, but a pretty cool way to use it for managing a website:
http://toroid.org/ams/git-website-howto
