Posted about 2 hours back at Ruby Inside
Anemone is a free, multi-threaded Ruby web spider framework from Chris Kite, which is useful for collecting information about websites. With Anemone you can write tasks to generate some interesting statistics on a site just by giving it the URL.
Its only dependency is Nokogiri (an HTML and XML parser). Other than that, you just need to install the gem to get started using Anemone's simple syntax which, among other things, allows you to tell it which pages to include (based on regular expressions) or define callbacks.
This example taken from Anemone's homepage prints out the URL of every page on a site:
require 'anemone'
Anemone.crawl("http://www.example.com/") do |anemone|
  anemone.on_every_page do |page|
    puts page.url
  end
end
The bin folder in the project contains some more in-depth examples, including tasks for counting the number of unique pages on a site, the number of pages at a certain depth in a site, or a list of urls encountered. There's also a combined-task which wraps up a few of these, intended to be run as a daily cron job.
You can install Anemone as a gem or get the source from Github of course, and there's some fairly comprehensive RDoc documentation available in the source or online.
Posted about 4 hours back at Pelargir
Ignite Raleigh looks quite interesting. It’s essentially a conference made up entirely of lightning talks. Voting is now taking place on submitted talks. The top 10 will be given on August 5th.
I submitted a talk titled 3 Secrets to Effective Nomading. Check out the description and, if you feel it’s compelling and would want to hear it, consider voting for it.
And by all means, if you have an idea for a talk, submit it!
Posted about 6 hours back at Jake Scruggs
This summer I'm revisiting my short apprenticeship at Object Mentor. I'll be posting commentary on all my posts from the summer of 2004 exactly 5 years later to the day.
Friday 7-2-04
We did use my code in the wiki conversion program. And while it was good to finish up that project, this meant we had to start working on the State Map Compiler. A program first written in 1993 in C++, then later re-written in Java in '98. No tests, very long methods, and the documentation gives lots of examples in C++ (which I don't know much about -- it seems like Java, but it most definitely is not.) The point of the program is to take in a simple statement of a 'state machine' and output source code that implements the state machine in either Java or C++. We are going to be adding a C# output option to this program.
Now, at this point, you might be wondering: 'What is a state machine?' Well, let me attempt to regurgitate the turnstile example. A turnstile has two states: Locked and Unlocked. If you try to pass through a turnstile when it's locked, you'll get quite a different response than when it is unlocked. A coin will cause a locked turnstile to become unlocked, but won't do much to an unlocked turnstile. A lot of software engineering problems can be thought of as nothing more than a dressed-up turnstile with a few more states. So it's handy to have a program that takes in some information about a finite state machine and then outputs the basic code that you can build your implementation around.
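To make the turnstile concrete, here is a minimal sketch of that state machine as a shell function. The function name turnstile_next and the event names coin and push are made up for illustration, and the rule that pushing through an unlocked turnstile locks it again is an assumption taken from the usual textbook version of the example, not from the post.

```shell
#!/bin/sh
# Hypothetical helper: print the state a turnstile moves to.
# States: locked, unlocked. Events: coin, push.
turnstile_next() {
  current=$1
  ev=$2
  case "$current:$ev" in
    locked:coin)   echo unlocked ;;  # a coin unlocks a locked turnstile
    locked:push)   echo locked ;;    # pushing a locked turnstile changes nothing
    unlocked:coin) echo unlocked ;;  # an extra coin changes nothing
    unlocked:push) echo locked ;;    # assumption: passing through locks it again
    *) echo "unknown transition: $current:$ev" >&2; return 1 ;;
  esac
}

# Feed a sequence of events through the machine, starting locked.
state=locked
for event in push coin coin push; do
  state=$(turnstile_next "$state" "$event")
  echo "$event -> $state"
done
```

A generator like the State Map Compiler would emit this kind of transition table from a short description, so adding a state means adding rows rather than rewriting control flow.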
Also, it's pretty interesting to be working on a compiler. Software that writes other software (that, ultimately, will be translated into byte code, and then the byte code will be translated by the virtual machine so that the real machine will know what to do -- quite a process).
Not a bad explanation of a state machine for a guy who just learned about them that day: good job, distant-past-Jake. I think I ended up using a state machine pattern in my code submission to ThoughtWorks later that year.
I later worked on a huge Java project that would generate something like 15 classes for every one written class. I soon got very tired of generated code ("Hey, why did my changes go away? Oh, it's a generated class... (next line said forlornly) Guess I better go dig through the XML pit for what to change.")
Posted about 6 hours back at Rails Envy - Home
If you want to know if your application can scale before it actually gets the traffic spike, then you need to learn how to do Load Testing. Thankfully I just released Load Testing – Part 2 of the Scaling Rails screencast series. If you haven’t seen the first video on Load Testing, you should probably start there.
Summary
In this second Load Testing screencast we pick up where we left off in the first one and learn how to use httperf load testing with sessions, how to automate our httperf testing using autobench, and how to graph the results from autobench. Lastly, we talk briefly about a few other load testing tools you might want to be aware of.
Don’t forget to subscribe to the screencast RSS feed or grab it on iTunes to avoid missing any of these episodes. FYI, these videos look great on an iPhone / iPod if you want something to watch on the go.
Posted about 7 hours back at InfoQ Personalized Feed for unregistered user - Register to upgrade!
Ruby on Rails has been around for about 5 years, and in those years developers have created a lot of applications. Many of those applications were created while learning Ruby and Ruby on Rails and may not have used best practices, yet they made it into production web sites. These web applications can be problematic, but a new book focused on the solution is available.
By Robert Bazinet
Posted about 8 hours back at InfoQ Personalized Feed for unregistered user - Register to upgrade!
Bob Frankston offers a vision of the Internet that focuses on communication and connection uninhibited by artificial barriers like carrier exclusivity, arbitrary differences in protocols, and vendor constraints. He uses stories as his organizing and presentational metaphor to share a vision of what could be, if we had free rein to follow our imagination.
By Bob Frankston
Posted about 11 hours back at danwebb.net - Home
Just a quick note for those googling for a solution to this. Older versions of Low Pro for Prototype will be experiencing problems with behaviors in Firefox 3.5. This issue has been fixed, so pick up the latest version from GitHub.
An interesting upshot of this is that I found out something pretty nice which is kind of obvious but had never occurred to me before. If you create a function in a closure then that function can refer to itself because it has access to that closure’s variables. I normally use arguments.callee to get a reference to a function from within its body but there’s never really any need to do that:
(function() {
  var aFunction = function() {
    aFunction.thing = 47;
  };
  aFunction();
  alert(aFunction.thing); //=> 47;
})();
Posted about 15 hours back at Git Ready
Stashing is a great way to pause what you’re currently working on and come back to it later. For example, say you’re working on that awesome, brand new feature when someone finds a bug that you need to fix. Add your changes to the index using:
git add .
Or add individual files to the index, your pick. Stash your changes away with:
git stash
And boom! You’re back to your original working state. Got that bug fixed? Bring your work back with:
git stash apply
You can also do multiple layers of stashes, so make sure to use
git stash list
to check out all of your current ones. If you need to apply a stash from deeper in the stack, that’s easy too. Here’s how to apply the second stash you’ve got:
git stash apply stash@{1}
You can also easily apply the top stash on the stack by using (Thanks jamesgolick!):
git stash pop
Note that this command deletes that stash for good once it has been applied, while apply does not. You can manually delete stashes with:
git stash drop <id>
Or delete all of the stored stashes with:
git stash clear
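For readers who want to try the round trip safely, here is a rough sketch that runs the commands above in a throwaway repository. The file name notes.txt, the commit message, and the user identity are placeholders chosen for illustration.

```shell
#!/bin/sh
set -e
# Work in a scratch repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "first version" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Start some work, stage it, then stash it away.
echo "half-finished feature" >> notes.txt
git add .
git stash

# While stashed, the file is back to its committed content (one line).
stashed_lines=$(wc -l < notes.txt | tr -d ' ')
git stash list

# Bring the work back; pop also removes the stash from the stack.
git stash pop
restored_lines=$(wc -l < notes.txt | tr -d ' ')
remaining=$(git stash list | wc -l | tr -d ' ')
echo "lines while stashed: $stashed_lines, after pop: $restored_lines, stashes left: $remaining"
```

Swapping git stash pop for git stash apply in this sketch would leave one entry in git stash list, which is the difference described above.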
Posted about 15 hours back at Git Ready
I frequently need to do this when setting up or syncing my various machines, and I seem to forget the command all the time. So let’s say you’ve got more than one branch on your remote, and you want to bring it down into your local repository as well:

Viewing information on the remote should look something like this:
$ git remote show origin
* remote origin
  URL: *************
  Remote branch merged with 'git pull' while on branch master
    master
  Tracked remote branches
    haml master
Luckily, the command syntax for this is quite simple:
git checkout --track -b <local branch> <remote>/<tracked branch>
So in my case, I used this command:
git checkout --track -b haml origin/haml
You can also use a simpler version:
git checkout -t origin/haml
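If you want to see the tracking setup without touching a real remote, the following sketch fakes one with a second local repository. The directory names upstream and local and the user identity are placeholders; only the haml branch name comes from the example above.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# A stand-in for the remote: one commit and an extra branch called haml.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git branch haml

# Clone it; the haml branch only exists as origin/haml at this point.
git clone -q "$work/upstream" "$work/local"
cd "$work/local"

# Create a local haml branch that tracks origin/haml and switch to it.
git checkout -q --track -b haml origin/haml

current=$(git rev-parse --abbrev-ref HEAD)
tracking=$(git rev-parse --abbrev-ref 'haml@{upstream}')
echo "on $current, tracking $tracking"
```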
Posted about 15 hours back at Git Ready
This is a topic that is a constant source of confusion for many git users, basically because there’s more than one way to skin the proverbial cat. Let’s go over some of the basic commands that you’ll need to undo your work.
So, you just want to revert one file back to its original state:
git checkout <file>
One problem you may run into is a file and a branch that share the same name. Since the checkout command is used both for reverting files and for switching to a different branch, you’ll need to use this syntax (thanks, Norbauer):
git checkout -- <file>
If you want to throw out all of the changes you’ve been working on, there are two ways to do that:
git checkout -f or git reset --hard
Once these commands are run you’ll lose all of the work that isn’t committed in your directory, so make sure to take caution when using them.
Also, be aware that ‘git revert’ is not equivalent to ‘svn revert’! git-revert is used to reverse commits, something another tip will cover in the future.
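As a quick illustration, this sketch tries both forms of undo in a scratch repository. The file names app.txt and lib.txt and the user identity are invented for the example.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "good" > app.txt
echo "good" > lib.txt
git add .
git commit -q -m "Initial commit"

# Revert a single file; "--" tells git that what follows is a path, not a branch.
echo "oops, broken" > app.txt
git checkout -- app.txt
after_checkout=$(cat app.txt)

# Throw away every uncommitted change to tracked files in one go.
echo "oops, broken" > app.txt
echo "oops, broken" > lib.txt
git reset -q --hard
after_reset="$(cat app.txt) $(cat lib.txt)"
echo "after checkout: $after_checkout, after reset: $after_reset"
```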
Posted about 15 hours back at Git Ready
You just committed that awesome feature/test/bug, but something just isn’t right. Either some information isn’t filled out, the commit message is wrong, or something else is just messed up. Let’s go over what can be done to fix the associated data after the fact.
Fixing the previous commit is very simple. Just use
git commit --amend
And that will fire up the commit patch in $EDITOR for you to mess with. Simply edit the message right on the top line of the file, save, quit and you’re done.
Now let’s say you’ve got a few commits to fix, or the commit that is broken is a few commits back. This process is a little more complex, but isn’t too bad. (Thanks to Evan Phoenix for this one.)
Here’s the scenario: the last 3 commits don’t have the right username, email, or perhaps commit message. Let’s start by generating patch files for the commits:
git format-patch HEAD^^^
That will generate files in the form of 0001-commit-message that contains the commit diff and metadata. One note with the 3 caret symbols: just add one for each commit you need to go back, and be consistent! EDIT: You can also refer to past commits with the syntax HEAD~n, so for this example we’d use HEAD~3. Go ahead and edit these files so they have the proper information. Once that’s done, we’ll need to reset our repository back a few commits:
git reset --hard HEAD^^^
Now we can apply each commit to fix the information. Make sure you do it in the right order! (Usually it’s ascending)
git am < 0001...
git am < 0002...
git am < 0003...
Now if you check git log you should see the right information. If for some reason stuff got messed up, doing a
git reset --hard origin/master
will bring you back to the commits as they exist on the remote (your starting point, assuming the originals had already been pushed). Once you’ve got the information fixed, you’ll need to do a force push on the repo:
git push -f
If you don’t add the -f, git will complain. WARNING! Just be aware that modifying commit messages may muck up others’ repositories and should be done with caution. In fact, other developers may flat out hate you for doing this. If it’s far back enough it’s probably a bad idea to rewrite history anyway.
If you know of any unexpected or uncovered consequences please let us know in the comments. Another tip in the future will cover fixing the actual committed files and making sure that all repos are up to date.
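The whole patch-and-reapply flow can be rehearsed in a scratch repository. The sketch below is one way to do it under some assumptions: the names "Wrong Name" and "Right Name", the file file.txt, and the use of sed to rewrite the From: header are all illustrative choices, not something the steps above prescribe.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Wrong Name"
git config user.email "wrong@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "Base commit"

# Three commits recorded with the wrong author.
for n in 1 2 3; do
  echo "change $n" >> file.txt
  git commit -q -am "Change $n"
done

# Export the last three commits as 0001-..., 0002-..., 0003-... patch files.
git format-patch -q HEAD~3

# Fix the author header in each patch (the "From: " line, not the "From <sha>" line).
for patch in 000*.patch; do
  sed 's/^From: .*/From: Right Name <right@example.com>/' "$patch" > "$patch.tmp"
  mv "$patch.tmp" "$patch"
done

# Drop the three commits, then re-apply the edited patches in ascending order.
git reset -q --hard HEAD~3
git am -q 000*.patch

authors=$(git log -3 --format=%an | sort -u)
count=$(git rev-list --count HEAD)
echo "authors of last 3 commits: $authors (total commits: $count)"
```

The re-applied commits get new hashes, which is why a force push is needed afterwards if the originals were already published.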
Posted about 15 hours back at Git Ready
This is a follow-up to the comments on yesterday’s article about interactive adding. Readers were begging for coverage of the powerful git add -p, a shortcut to the patch mode of interactive adding. This command is capable of breaking up changes in files into smaller hunks so you can commit exactly what you want, and not just the whole file.
This behavior is very helpful since it allows you to version any “hunk” of text in the file that you’d like. Perhaps that one line of code will break the build on others’ machines or the build server, but just not yours. You get the point.
Let’s go through an example of using it, making changes to our project’s trusty README file again. Here’s what git diff gives us from the changes that were just made:
diff --git a/README.md b/README.md
index 2556dae..45d8b6e 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,14 @@
> There is only one way...
> ~ Henry Van Dyke
+# Project information
+
+Blah blah blah...
+
# Want to contribute?
If you have ideas about...
+
+# About
+
+This blog is just awesome.
So as we can see, we’ve made changes in separate parts of the file. However, I want to only commit the lower half. Besides, who really needs documentation anyway? Enter:
git add -p
Run this and you’ll be greeted with a prompt just like git add -i, but it has quite a few options:
Stage this hunk [y/n/a/d/s/e/?]? ?
y - stage this hunk
n - do not stage this hunk
[... removed a few ...]
s - split the current hunk into smaller hunks
e - manually edit the current hunk
? - print help
Most are pretty self-explanatory, but let’s go over the really cool ones. What we really want is the s option, which will split the current hunk into two smaller hunks, one for the upper “Project Information” block and one for the “About” block. Once it’s split it will prompt you with just the changes for that hunk:
Split into 2 hunks.
@@ -1,6 +1,10 @@
> There is only one way...
> ~ Henry Van Dyke
+# Project information
+
+Blah blah blah...
+
# Want to contribute?
If you have ideas about...
Stage this hunk [y/n/a/d/j/J/e/?]? n
From here we can ignore it with n, and move on to the next one and add it with y:
@@ -4,3 +8,7 @@
# Want to contribute?
If you have ideas about...
+
+# About
+
+This blog is just awesome.
Stage this hunk [y/n/a/d/K/e/?]? y
Doing a git status will show changes to be committed and changes to be staged on one file. Strange, but it makes sense: we staged half the changes, and the other half are not yet added to the index.
There’s also the e option, which will let you manually flip the hunks on and off in your favorite editor. Instructions are included with that option, so go play around with it when you get the chance. (No pun intended.)
If you’ve got any stories as to how git add -p has helped you (or if a neat feature was missed), let us know in the comments!
Posted about 15 hours back at Git Ready
Sometimes simple adding with git add . or git commit -am just isn’t enough. You may want to split changes up over several commits, or you’re just not ready to add everything yet. And who wants to add individual files one at a time? That’s just boring. Enter the interactive adder:
git add -i
Let’s go through an example of using the interactive mode. I made some changes to the gitready project’s readme along with another file. Calling the above command gives us this output, the status of the index:
staged unstaged path
1: unchanged +3/-1 README.md
2: unchanged +1/-1 _layouts/default.html
*** Commands ***
1: status 2: update 3: revert 4: add untracked
5: patch 6: diff 7: quit 8: help
What now>
As you can see, we have plenty of commands available to us. If we want to see this screen again, using the status command will bring us back. The +3/-1 are the number of lines that have been added/removed, the usual plus and minus symbols that you see when pulling.
Let’s add the changes we made to the readme. Using the update command will allow us to do this. After choosing this command we’ll see this output:
staged unstaged path
* 1: unchanged +3/-1 README.md
2: unchanged +1/-1 _layouts/default.html
Update>>
And if we select 1 to update, git will notify us that the file is staged to be committed. Now if we look at the status, you can see that our readme has been staged properly.
staged unstaged path
1: +3/-1 nothing README.md
2: unchanged +1/-1 _layouts/default.html
Once you’re done, you can dump out using the quit command, and commit your work. If you don’t trust the adder, calling git status will show that only the readme has been staged for the commit:
# On branch master
# Changes to be committed:
#
# modified: README.md
#
# Changed but not updated:
#
# modified: _layouts/default.html
There are plenty of other helpful things that interactive adding can do; this is just the start. Check out the help command for the rest of what’s available to you. We’ll have more tips in the future for the other commands available inside interactive mode.
Posted about 15 hours back at Git Ready
So, you want to see your repository in a brand new way. You’re sick of the command line, you need to see some graphs! Pixels! Buttons! Graphics! Dialog boxes! Ok, we get the point.
Our first option is viewing your repo in a browser. This functionality is packaged with most git installs:
git instaweb
This will fire up a server, usually lighttpd, to serve a simple web interface for your repository. You can browse commits, trees, view files, what have you.

This is really useful if you need to dive down and see history but you don’t know the commands yet. (And it’s easier too.) If you don’t have lighttpd installed and don’t want to bother with it, running
git instaweb --httpd webrick
will force WEBrick to serve the page, which will work just as well if Ruby is installed on your system. It also works with Apache; just check instaweb’s manpage to see the supported servers and other goodies.
Webpages are great and all, but what if you want to see your commits in a more…graphical manner? Look no further than gitk or gitx then. These programs give you a more dynamic view of your repository and let you actually see branches and merges take place:

Need some graphical git loving? gitk ships with most git installs, so just run gitk, or download gitx for OS X. Windows users, if you know of an equivalent, comment away!
There’s also plenty of git graphical love over at GitHub for every project, be it visualizing the impact individuals have had on the project, times that people have worked on the project, and even more. Definitely play around with various graphs they have to offer if you haven’t yet.


There’s also plenty of other good visualizers over at the GitWiki. If you know of any other good ways to see your repository graphically, let us know in the comments. I’ll continue to update the post with more and better knowledge as it becomes available.