What is actually the Working directory in Git? [closed] - java
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I have spent a lot of time trying to get a clear idea of the 'working directory' in Git.
Is it a specific folder or directory, or is it a version of a directory? Can anyone help me understand this concept?
What if I create a directory 'mydir' locally
and then run: git init?
thanks
In Git, the phrase working directory was once a synonym for working tree. It isn't any longer, because the phrase working directory may also be used by your OS (usually with a third word in front, as current working directory). Modern Git tries to use the phrase working tree as much as possible, though this is sometimes shortened to work-tree or worktree, as in git worktree add for instance.
In your OS, the phrase current working directory refers to the folder or directory[1] you are working in at the time. That may be within your working tree.
In Git, the phrase working tree refers to the OS-maintained directories-and-files that hold your copies of files. These are yours, to deal with as you wish: Git simply fills them in from committed files.
What if I create a directory 'mydir' locally then I [run]: git init
Let me rephrase this as the following series of shell commands:
$ mkdir mydir
$ cd mydir
$ git init
The mkdir creates a new, empty directory, within your current working directory. The cd then enters this empty directory, so that now what was ./mydir is your current working directory. The git init command runs with its own current working directory being this empty directory.
Since the directory mydir was empty at the time you ran git init, Git will create a hidden directory / folder named .git within this mydir directory. This hidden directory contains the repository proper. The repository consists of a number of files and directories that implement several databases:
One database is a simple key-value store that uses hash IDs to locate internal Git objects. This is the main (and usually largest) of the two primary databases that make up a Git repository.
One database is another simple key-value store that uses names as keys, to store hash IDs, which are then used in the first database. This is the secondary database that makes up a Git repository. This particular database's implementation in current versions of Git tends to be a bit dodgy: it stores names as files and so relies heavily on your operating system's file system. On macOS and Windows, whose default file systems are case-insensitive, names that differ only in case can collide. There is ongoing work in Git to replace this with a proper database implementation, which will eliminate this problem.
Apart from these two main databases, the repository contains many auxiliary files, including Git's index (aka staging area). The most important point here, though, is that all of these entities live within the .git directory.
As there are no commits yet, both main databases are empty. At this point, so is Git's index.
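The two key-value stores can be poked at directly with Git's plumbing commands. The following is a minimal sketch in a separate throwaway repository (the mktemp scratch location and the tag name greeting are placeholders chosen for illustration): it stores one object, reads it back by hash ID, and then gives that hash ID a name.

```shell
# Throwaway repository in a scratch directory (placeholder location).
repo=$(mktemp -d)
cd "$repo"
git init --quiet

# Objects database: the key is a hash ID, the value is the stored content.
blob_id=$(printf 'hello world\n' | git hash-object -w --stdin)
content=$(git cat-file -p "$blob_id")

# Names database: the key is a name, the value is a hash ID.
# A tag name is used here because branch names may only hold commit hash IDs.
git update-ref refs/tags/greeting "$blob_id"
resolved=$(git rev-parse refs/tags/greeting)

echo "hash ID:  $blob_id"
echo "content:  $content"
echo "greeting: $resolved"
```

The name resolves to the same hash ID that the object store handed back, which is all the "names" database does.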
Your work-tree consists of all files and directories inside your current working directory except the .git directory, which holds Git's files. Since your work-tree is yours, and is maintained by your OS (not by Git), you can now create any files you like here.
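To see the difference between the OS's current working directory and Git's working tree, you can ask Git where the top of the working tree is while you are somewhere below it. A small sketch, assuming a scratch directory from mktemp and a made-up subdirectory src/deep (the --absolute-git-dir option needs a reasonably recent Git):

```shell
# Scratch location and directory names are placeholders.
scratch=$(mktemp -d)
mkdir "$scratch/mydir"
cd "$scratch/mydir"
git init --quiet

# Move the OS-level current working directory two levels down.
mkdir -p src/deep
cd src/deep

cwd=$(pwd)                                   # current working directory (OS concept)
top=$(git rev-parse --show-toplevel)         # top level of the working tree (Git concept)
prefix=$(git rev-parse --show-prefix)        # cwd relative to the top of the working tree
git_dir=$(git rev-parse --absolute-git-dir)  # the repository proper

echo "cwd:     $cwd"
echo "top:     $top"
echo "prefix:  $prefix"
echo "git dir: $git_dir"
```

However deep you cd, the working tree stays the same; only the prefix changes.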
At some point, you will want to have Git create a new commit. This will be the very first commit in the repository. To create this commit, you will add the files you would like to go into this initial commit, into Git's index / staging-area, using git add. The git add program works by copying your work-tree files into Git's index. So, with your OS's current working directory being the mydir directory, you can now just create some file(s):
$ echo "repository for project X" > README
$ git add README
$ git commit
The echo command here creates a new file named README in your working tree. The git add command takes the working tree file, compresses and Git-ifies it to make it ready to be stored in a new commit, and writes the stored file into Git's index.[2] The final command, git commit, gathers some metadata from you (the person making the commit) and writes out Git's index and this metadata, storing the results in the main database, to create a new commit.
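To watch the index fill up, you can list its contents before and after the git add. This is a sketch in a throwaway repository; the identity set with git config is a placeholder, and -m is used only so that no editor opens.

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "repository for project X" > README
before=$(git ls-files --stage)               # the index is still empty
git add README
after=$(git ls-files --stage)                # mode, blob hash ID, stage number, path

git commit --quiet -m "Initial commit"
commits=$(git rev-list --count HEAD)

echo "index before add: '$before'"
echo "index after add:  '$after'"
echo "commits so far:   $commits"
```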
Once you've made this new, initial commit (the very first commit in the repository), it becomes possible for branch names to exist. They cannot exist until this point because each branch name must hold a valid, existing hash ID, and hash IDs for future commits are not predictable.[3] Now that there is one commit, that's the only hash ID that any branch name can hold.[4]
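You can check this in a fresh repository: HEAD already names a branch, but no branch exists until the first commit is made. A sketch, assuming a scratch directory and a placeholder identity; --allow-empty is used only so the example does not need any files.

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

# HEAD names a branch, but that branch does not exist yet.
branch=$(git symbolic-ref --short HEAD)
branches_before=$(git for-each-ref refs/heads | wc -l | tr -d '[:space:]')

git commit --quiet --allow-empty -m "Initial commit"

branches_after=$(git for-each-ref refs/heads | wc -l | tr -d '[:space:]')
head_id=$(git rev-parse HEAD)
branch_id=$(git rev-parse "refs/heads/$branch")

echo "branch named by HEAD: $branch"
echo "branches before:      $branches_before"
echo "branches after:       $branches_after"
echo "branch holds:         $branch_id"
```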
Over time, you will add more and more commits to the repository. (In general, it's pretty rare to ever drop a commit, except for, e.g., the way git rebase replaces commits with new-and-improved ones. It's not impossible, it is just difficult.) Each new commit therefore adds to the repository.
The repository itself, then, consists of:
the databases that hold commits and other objects, and the names that find them;
Git's index, used to hold your proposed next commit; and
other maintenance items that you and/or Git may find useful.
The commit objects, and in fact all objects in the big database, are strictly read-only. Nothing and no one can ever change them. They're in a form that is directly useful only to Git itself, though.
Cloning the repository consists of copying the two databases, although the names database is only partly copied, and gets changed during the cloning process.
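The way names change during cloning can be observed with a local clone. In this sketch the directory names original and copy and the branch name feature are made up for illustration: the branch in the original shows up in the clone only as a remote-tracking name, while the hash ID it holds is the same.

```shell
scratch=$(mktemp -d)                         # placeholder scratch location
git init --quiet "$scratch/original"
cd "$scratch/original"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
git commit --quiet --allow-empty -m "Initial commit"
git branch feature                           # hypothetical second branch

git clone --quiet "$scratch/original" "$scratch/copy"

original_names=$(git for-each-ref --format='%(refname)')
copy_names=$(git -C "$scratch/copy" for-each-ref --format='%(refname)')

echo "names in the original:"
echo "$original_names"
echo "names in the clone:"
echo "$copy_names"
```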
Meanwhile, your working tree is where you have Git extract commits, turning stuff that's only directly useful to Git—and that is read-only—into stuff you can work with and modify. These are your files. This is how you do your work, in your working tree. You can use the results to update Git's index, and then use Git's index to create a new commit, that adds on to the repository without changing anything that already exists in the repository.
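A short sketch of that cycle, assuming a scratch repository and a placeholder identity: the file in the working tree is edited freely, a second commit is added, and the first commit still holds exactly what it held before.

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "version 1" > README
git add README
git commit --quiet -m "First commit"
first=$(git rev-parse HEAD)

# Edit the working tree copy, update the index, add a second commit.
echo "version 2" > README
git add README
git commit --quiet -m "Second commit"

old=$(git show "$first:README")              # content as stored in the first commit
new=$(cat README)                            # content in the working tree
first_again=$(git rev-parse HEAD~1)          # parent of the new commit

echo "first commit still holds: $old"
echo "working tree now holds:   $new"
```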
[1] At the OS level, the terms folder and directory are synonyms. Git itself does not store folders or directories: it just stores files whose names may contain embedded slashes, such as path/to/file.ext. That's all one single file name. Your OS may force you to first make a folder named path, then in that folder, make a folder named to, and only then use the combined path and to folders to make a file named file.ext within that path. The current working directory can be changed to path, so that you would use the name to/file.ext, instead of path/to/file.ext, or even to path/to so that you would use the name file.ext. In all cases, Git will internally work with a stored file named path/to/file.ext. So your current working directory is an OS concept, referring to how you move around within the folders that your OS maintains.
[2] Technically, the index doesn't actually hold the files directly. It holds instead a Git blob object hash ID for the file, which provides the key to the key-value object database so that Git can look up the file's content, plus the name of the file, complete with (forward) slashes, and some additional information. The blob object holds a compressed and de-duplicated copy of the file's content.
This de-duplication, and the fact that it is git add that readies the file for committing, means that git commit will go quite fast, as it need not prepare anything for committing: it just saves, permanently, the blob objects already stored in the index.
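The de-duplication is easy to observe: two files with identical content end up with the same blob hash ID in the index. A sketch with made-up file names a.txt and b.txt in a scratch repository:

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet

echo "same content" > a.txt
echo "same content" > b.txt
git add a.txt b.txt

# The second space-separated field of each index entry is the blob hash ID.
id_a=$(git ls-files --stage a.txt | cut -d' ' -f2)
id_b=$(git ls-files --stage b.txt | cut -d' ' -f2)

# git hash-object computes the ID from the working tree file without storing anything.
computed=$(git hash-object a.txt)

echo "a.txt in index: $id_a"
echo "b.txt in index: $id_b"
echo "computed:       $computed"
```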
[3] The hash ID of a commit is a cryptographic checksum of the commit's complete content. The content includes not only the saved source files (as an internal Git tree object), but also the exact date-and-time-stamp. Since we don't even know what you'll commit in the future, much less exactly when you will commit it, we cannot compute what the future hash ID will be. You may know what you will commit, which gets you closer; but unless you know exactly when you will commit it, you won't know the hash ID either.
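The dependence on the time stamp can be demonstrated by building commit objects by hand with a pinned date. This is only a sketch: the identity, the dates and the helper function commit_at are all made up for illustration, and git commit-tree is used instead of git commit so that everything else stays identical.

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet

# Placeholder identity, fixed so that only the date varies.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

tree=$(git write-tree)                       # tree object for the (empty) index

# Hypothetical helper: create a commit object with the given date stamp.
commit_at() {
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
        git commit-tree "$tree" -m "Initial commit"
}

same_1=$(commit_at "1600000000 +0000")
same_2=$(commit_at "1600000000 +0000")
later=$(commit_at "1600000001 +0000")        # one second later

echo "same content, same time:  $same_1 / $same_2"
echo "same content, 1s later:   $later"
```

Identical content and identical time stamps give the same hash ID; moving the time stamp by one second gives a different one.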
[4] Branch names in particular are constrained: they may only hold a commit hash ID. Tag names can hold the hash ID of any of Git's four internal object types. (Usually, though, a tag name either holds a commit hash ID, or the hash ID of a newly-created annotated tag object, which in turn holds a commit hash ID.) Other types of names may have their own constraints.
Related
Git manage environment specific configuration
I have a requirement to keep a property configuration for different environments such as dev, uat and production. For example, a config.properties has an entry like environment=dev, which I need to change to environment=uat on the staging branch and to environment=prd on the master branch. I tried committing these files in each branch and then adding config.properties to .gitignore so that it would not be considered in subsequent commits. But the ignore rule did not take effect, so I ran:
$ git rm -rf --cached src/config.properties
$ git add src/config.properties
$ git commit -m ".gitignore fix"
But these commands delete the file from the local repository itself, and the subsequent commits also delete it from the branches. I want to set up the branches so that Jenkins can do the deployment without anyone editing the config file manually. I am using Fork as my Git UI. Is there any way to handle this kind of situation?
You should not version a config.properties (git rm is right), and you should indeed ignore it. That way, it won't pose any issue during merges.
It is easier to have three separate files, one per environment:
config.properties.dev
config.properties.uat
config.properties.prd
In each branch, you would then generate config.properties, with the right value in it, from one of those files, depending on the current execution environment.
Since you have separate branches per environment, with the right file in it, you can have a generation script which will determine the name of the checked out branch with:
branch=$(git rev-parse --symbolic --abbrev-ref HEAD)
That means you could:
version only a template file config.properties.<env>
version value files named after the branches: config.properties.dev, config.properties.uat...: since they are different, there is no merge issue when merging or switching branches.
Finally, you would register (in a .gitattributes declaration) a content filter driver.
(image from "Customizing Git - Git Attributes", from "Pro Git book")
The smudge script, associated to the template file (package.json.tpl), would generate (automatically, on git checkout) the actual config.properties file by looking up values in the right config.properties.<env> value file. The generated actual config.properties file remains ignored (by the .gitignore).
See a complete example at "git smudge/clean filter between branches".
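The generation step (without the smudge filter) might look roughly like the sketch below. The branch-to-environment mapping (staging to uat, master to prd, anything else to dev) is an assumption taken from the question, and the scratch repository only exists to make the example self-contained.

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

# Versioned value files, one per environment; the generated file is ignored.
printf 'environment=dev\n' > config.properties.dev
printf 'environment=uat\n' > config.properties.uat
printf 'environment=prd\n' > config.properties.prd
printf 'config.properties\n' > .gitignore
git add .
git commit --quiet -m "Add per-environment value files"
git checkout --quiet -b staging

# Generation script: pick the value file from the checked-out branch name.
branch=$(git rev-parse --symbolic --abbrev-ref HEAD)
case "$branch" in
    staging) env=uat ;;                      # assumed mapping, see above
    master)  env=prd ;;
    *)       env=dev ;;
esac
cp "config.properties.$env" config.properties

generated=$(cat config.properties)
pending=$(git status --porcelain)            # stays empty: the generated file is ignored
echo "branch $branch -> $generated"
```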
Compile a package with all changed Java classes
I have built a set of Java classes that act as a kind of plugin in a third-party application. Whenever a new request lands on my desk, I create new classes (plugins) or modify existing ones. To make the changes available to the third-party application, I can put a JAR into a so-called extlib directory or put single class files into a so-called ext directory. I am looking for a proper way to handle different versions of my files. When changing only a single class, it is a bad idea to replace all class files in the ext dir. The same problem applies when compiling as a JAR: after changing one single class, I would have to compile a whole new JAR with all files inside. Replacing all files carries the risk of accidentally shipping an untested change. Do you have any hints / best practices for how I could manage the different file versions? My ideas: some kind of patch would be great. When changing some files, I just push a button to compile a zip archive with all changed files inside, ideally with a version mark in all the files. Would something like this be possible with Eclipse plugins?
There exist a number of workflows for this. The terminology varies a bit depending on the version control system that you are using; I am going to use git terminology. One workflow is to always work on a branch, and to never merge a branch that has not been thoroughly tested into master. So, then, a release is only made from master. Another workflow is to work on whatever branch you want, merge into master whenever you want, then every once in a while pull from master into a designated "release" workspace, do your testing there, and then release from that. As for binary patching, I am sure there exist tools out there, but I do not know of one, I asked a few people and they don't know either, so I have no answer for you here. I suppose if you have .class files you can use some folder synchronization tool, but if you have .jar files then you are going to be replacing them in their entirety.
Intellij - Git status shows files have been changed when they have not
I am working on a Java project in Intellij that uses git. Quite a few files are blue (to show that changes have been made), however when I right click them and click on "Git -> Compare with Latest Repository Version" it says that the contents are identical. Anyone know why this happens? It only seems to happen to files that I've opened to look at but haven't changed. Could it happen if I accidentally added extra white space and then deleted it or something? Or just extra whitespace in general?
This is one way Git differs from SVN. Git's change detection does not depend only on the content of the file but also on its metadata (last-modified timestamp, etc.). So even if you add just one space and remove it again, saving the file modifies its metadata. For more details, have a look at: What algorithm does git use to detect changes on your working tree?
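For plain Git on the command line, the metadata mainly serves as a hint about which files need a closer look; the content comparison then settles whether the file counts as modified. The sketch below checks what git status reports after a metadata-only change and after a real edit. The file name and the date are arbitrary, and an IDE's own change tracking may behave differently from this.

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "unchanged" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

# Change only the metadata: set the modification time to 1 January 2000.
touch -t 200001010000 notes.txt
after_touch=$(git status --porcelain)

# Now change the content as well.
echo "changed" >> notes.txt
after_edit=$(git status --porcelain)

echo "after touch: '$after_touch'"
echo "after edit:  '$after_edit'"
```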
When do I commit when moving files in a git? (Jgit)
I am implementing a bot that performs scheduled backups. From a front-end, a user will be able to change the names of the folders the backups are stored in. According to: What's the purpose of git-mv?
mv oldname newname
git add newname
git rm oldname
is what I want to do when a folder or file name is to be changed. So I move the files using Java FileUtils, then add the new file/folder and remove the old file/folder using:
git.add().addFilepattern(newName).call();
git.rm().addFilepattern(oldName).call();
git.commit().setAll(true).setMessage("Renamed group "+oldName+ " to " +newName).call();
The main goal is to preserve the history of the files being moved. Should I commit after adding the 'new' file, before removing the 'old' one? Is my current order of operations fine, and will committing after both operations preserve the change history? I am still new to Git and how the logging works. In TortoiseGit it shows files added and removed; would it show up as a move in the log if the process worked? Thank you for your time.
Git does not actually record history of individual files in the repository; it records the history of the entire repository as a single unit. There's nothing in a commit that explicitly says that the foo.txt in revision 2 is a continuation of the bar.txt in revision 1. Instead, renames are inferred by tools that examine the repository — after the changes have been committed — using the heuristic that if a commit removes a file and also creates another file with similar contents, the old file was renamed to the new one. This heuristic only recognizes a rename if both changes occur in the same commit. If you remove a file, commit, then add the file back with a different name and commit again, Git will see that as separate deletion and addition of unrelated files. Note that rename detection is optional and tools may not do it by default. With git log you need to use the -M option, for example, or do git config --bool diff.renames true.
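The inference can be seen by comparing the same two commits with and without rename detection. A sketch in a scratch repository, reusing the file names bar.txt and foo.txt from above; the removal and the addition land in one commit.

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

printf 'line 1\nline 2\nline 3\n' > bar.txt
git add bar.txt
git commit --quiet -m "Add bar.txt"

# mv + git add + git rm, all recorded in a single commit.
mv bar.txt foo.txt
git add foo.txt
git rm --quiet bar.txt
git commit --quiet -m "Rename bar.txt to foo.txt"

with_detection=$(git diff -M --name-status HEAD~1 HEAD)
without_detection=$(git diff --no-renames --name-status HEAD~1 HEAD)

echo "with -M:"
echo "$with_detection"
echo "with --no-renames:"
echo "$without_detection"
```

The stored commits are the same in both cases; only the tool's interpretation (one rename versus a deletion plus an addition) differs.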
I'm not familiar with JGit, but your Java code should probably mirror what Git is actually doing beneath the interface when you run your command. Since you are already doing this, I don't see any problem. I would make sure that the entire renaming operation appears in a single commit. There are several reasons for wanting to do this. You may want to revert the renaming at some point. If you have a single commit, it would be easy to do this via git revert. With regard to preserving the history, renaming a file makes it harder to track the history, but not impossible, e.g. git log --follow ./path/to/file
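Both points (history across a rename, and reverting a single rename commit) can be tried in a scratch repository. In this sketch the path path/to/old.txt is a made-up starting name, and the rename is done with git mv for brevity.

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

mkdir -p path/to
echo "content" > path/to/old.txt
git add path/to/old.txt
git commit --quiet -m "Add file"

git mv path/to/old.txt path/to/file
git commit --quiet -m "Rename file"

# Count the commits listed for the new name, with and without --follow.
followed=$(git log --follow --oneline -- path/to/file | wc -l | tr -d '[:space:]')
plain=$(git log --oneline -- path/to/file | wc -l | tr -d '[:space:]')
echo "commits with --follow:    $followed"
echo "commits without --follow: $plain"

# Because the rename is a single commit, one revert undoes it.
git revert --no-edit HEAD > /dev/null
ls path/to
```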
Including Source in Archive File
I was considering including the source code in my archive file (EAR, JAR, WAR) so we could see what the deployed application looks like. This will obviously make the archive much bigger. Does the size of the archive file affect performance on the application server at all? Is this a good idea or not?
Here's another solution to your problem, which can come instead of putting the sources in the .jar or in addition to it: add the source control revision id to the .jar. You can specify it in the manifest file or in a properties file. The source control revision id is the current id associated with the root of your project on the source control system. It is readily available in SVN, Git and most modern source control systems. In older systems (CVS) you must first create a (named) tag, probably by using the current date and time (to ensure uniqueness). The revision number will let you retrieve from source control the exact snapshot from which the archive was obtained, so when you fix bugs you'll be fixing them on the correct sources. This technique will allow you to save space: you won't need to ship the whole source directory. However, it is still a good idea to specify this number even if the sources are included in the archive, simply because it unambiguously specifies the point in history where the archive was created. Manually comparing the sources in the archive with those in source control is a pain.
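With Git, recording the revision id at build time can be as small as the sketch below. The file name build-info.properties and the key scm.revision are hypothetical; use whatever your build and packaging setup expects.

```shell
repo=$(mktemp -d)                            # placeholder scratch location
cd "$repo"
git init --quiet
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
git commit --quiet --allow-empty -m "Initial commit"

# At build time: write the current revision id into a properties file
# (hypothetical file and key names) that gets packaged into the archive.
revision=$(git rev-parse HEAD)
printf 'scm.revision=%s\n' "$revision" > build-info.properties

# Later: read the id back and confirm it names a commit in the repository.
recorded=$(sed -n 's/^scm\.revision=//p' build-info.properties)
kind=$(git cat-file -t "$recorded")

echo "recorded revision: $recorded ($kind)"
```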
It does affect the performance to the extent that there are more entries in the archive's index but that won't be too bad and you'll put the source in its own subdirectory. Or you can just make a source archive and shovel it around with the other files. That would be my choice. Of course if this is a GPL distribution, you'll have to be explicit about where the source is located.