25
Jul 08

Pull pork with care

I just committed the giant change that brings Elsa’s namespace pollution down to reasonable levels. Elsa code now uses either std::foo style or using namespace std. As I mentioned before, Elsa’s string is now sm::string; a summary of how to perform similar renames is here. The good news is that Pork will soon work out of the box with a modern toolchain.

For the handful of porkers out there: you need to hg pull & hg up all of the pork repositories. This has been a case study in why splitting up a codebase into a billion repositories is a bad idea:

a) Lovely, I have to do many commits instead of one

b) To top it off, now my users will curse my name while updating whichever pork repository interests them most.

I feel like I’m going to throw up if I see any more C++ code diffs in the next 10 minutes.

In contrast, while rewriting things at Mozilla scale is a lot less feasible by hand, it is very rewarding to automate. Gotta love big C++ codebases.


22
Jul 08

Dogfooding pork & OSCON

I wrote a class renamer and used it to fix my pork pet peeve #1: a class named string that isn’t std::string. This has been a low-priority goal for as long as I’ve been using Elsa. It’s pretty cool to apply a tool to fix itself.

The renamer is 3x simpler than the next simplest tool. I plan to extend it to also rename class members. Renaming is the most trivial use case for rewriting code, so I plan to post a tutorial on using the renamer in the near future.

OSCON

If you are at OSCON, you do not want to miss our static analysis session on Wednesday.


18
Jul 08

Pork, MCPP, Oink and Elsa…What’s going on?

It seems that there is some confusion as to what pork is and how it’s related to oink and elsa. So here is my view of it.

Pork is my set of tools that use Elsa to rewrite source code (mainly Mozilla code). We use Pork solely for rewriting, since it is not as well suited to convenient, hardcore analysis as the GCC-based tools are.

MCPP is the secret sauce C preprocessor that makes C++ rewriting with Elsa possible by annotating preprocessed files with information to undo the lexical braindamage resulting from macro expansion.

Elsa is an awesome C++ parser. Awesome in that it can preserve more information about parsed code than any other C/C++ parser, and it is easy to extend.

We maintain our own version of Elsa within pork.

I think our version of Elsa is the most up to date and the most compatible with newer C++ features and the headers used by newer GCC releases. We encourage other projects with C++ parsing/rewriting needs to collaborate with us. We will be parsing code with Elsa for a few years to come, and maintaining a C++ parser is a lot of work for a single entity. I think Elsa is a much better backend to build refactoring support on than any other C++ parsing project out there right now.

The Messy Details

Now let’s move on to the more confusing parts: oink, oink-stack, and the oink mailing list.

oink consists of some static analysis tools and was meant to be the central place where all Elsa and Elsa-related development would happen. When people refer to oink, they usually mean the oink-stack, which is a Subversion meta-repository that pulls in about a dozen subrepositories (smbase, elkhound, elsa, oink (where the static analysis tools live), etc.).

When I started working on refactoring tools I was told that I should aim to have my tools added to oink, but there were some legal hassles to work out in the meantime. So I cloned the oink-stack and developed my tools with minimal changes to it. This included various Elsa extensions, bugfixes, etc.

However, the little momentum that oink had has fizzled out due to various personality conflicts and various academics losing interest. The code has been bitrotting for as long as I’ve been working at Mozilla.

So the end result of oink is that we have pork which is a superset of oink. I’m not even sure if I mention the name pork anywhere in the sources. So pork at the moment means “Taras’ continuation and extension of oink”. I am using the oink mailing list for any discussion on changes to Elsa/etc in hopes that at least some of the genius lurkers there will regain their interest in elsa.

Where do We Go From Here?

Onward! Due to the original author’s vision of what C++ is and the state of C++ at the time Elsa was conceived, current pork code causes people to have many WTF moments (followed by banging head against keyboard) when they first start using it.

The short version of my plan is:

  • Allow one to do “using namespace std” when using Elsa
  • Restructure the pork repositories such that there are only 3 of them rather than 11 (elsa, elkhound, pork)
  • Get rid of the oink repository (those tools do not work for us)
  • Make pork consist of just my tools (with a sane build system) rather than being mixed into unmaintained oink stuff
  • Make pork compile with new compilers (GCC 4.3 and recent MSVC++)
  • Keep track of this in a bug
  • Clean up various misc things

Some of you might ask, “But Taras, why now? Why not just keep doing what you’ve been doing?” I was doing what I was doing because I had the overwhelming goal of devising a way to automate static analysis and refactoring of Mozilla on my shoulders, and I wasn’t convinced that it was feasible. I had to learn to split my time between tool development and actually using the tools. Naturally I cut corners on tool development :)

Since then, slowly but surely, various awesome hackers have started doing rewrites and analyses themselves, freeing me up to focus more on development. To make matters sweeter, various hackers have started submitting bug reports, fixes, and ports for my tools. This gives me more time to focus on the big picture.

Finally, I believe that automation of the sort we are doing at Mozilla is something that has been missing from open source development practices, and it will catch on once people realize what they’ve been missing. Reducing those WTF moments will help people think positively.



09
Jul 08

Static Analysis and Refactoring Tooling Updates

Hydras

I am close to landing a flow check. It turns out that it is super easy to introduce new analyses into Mozilla thanks to some very nice build system hooks set up by bsmedberg.

Since coming back from the GCC summit I have forward-ported our GCC patches to GCC trunk. The FSF legal paperwork came through today, so I posted the first and biggest patch to GCC for review.

I am not sure if I mentioned this before, but the C port of Dehydra is somewhat operational. It doesn’t yet have access to function bodies, but type traversal should work. Unfortunately, the C frontend has fewer features (pretty printing sucks, locations are even less reliable, etc.) and thus is less awesome to work with than the C++ frontend.

Pork

jst was awesome enough to list some interfaces that need some outparamdelling. The list is here (in the content/ section). This led me to spend some time making outparamdel’s output prettier. There are still some improvements to be made, and I will be making them in the near future. However, if someone is interested in having refactoring of this kind land in the near future, they could easily complete outparamdel’s work with some clever scripting and a bit of manual labour. Sure beats doing the entire thing manually. From outparamdel’s perspective, the last 10% appears to be slightly painful and might take some time.

Here is a patch that takes about 30 seconds to produce.

Another exciting aspect of this is that a certain emacs wizard has confirmed that it would be possible to feed emacs such a patch file and have it correct indentation for the affected areas only.

I am also very excited that a certain volunteer came forward and decided to start improving some of the stomach-turning areas of Pork. Hopefully in the near future we’ll modernize the C++ a little bit and a user’s first reaction won’t be: “What the hell, why can’t I do ‘using namespace std;’”.

To this end I have filed a bug to write a renamer tool so we can dogfood renaming of unfortunately named pieces of code.

OSCON

The plan is to have some sort of a minisession on our static analysis efforts at Mozilla. So if you are attending OSCON and are interested in doing exciting things to depressingly large amounts of code, drop me a line.


08
Jul 08

Where is the sanity in the C++ std library?

Dear lazyweb,

Please explain to me why the following code works the way it does. Having read the code and the docs for stringstream::str() and stringstream::str(string), its behavior does not make sense to me.

#include <sstream>
#include <iostream>

using namespace std;

int main() {
    // Seed the stream with some initial contents.
    stringstream ss("foo");
    cout << ss.str() << endl;
    ss << "bar";
    cout << ss.str() << endl;
    ss << "more";
    cout << ss.str() << endl;
}
