Recently by Joe Ferner

I've been doing some programming in node.js and needed a way to parse network packets. node-pcap just wasn't cutting it anymore, so I figured why not use the best tool for the job: Wireshark. Under the covers Wireshark uses libwireshark, which is also used by tshark and rawshark to dissect network packets. When you download the source for Wireshark you won't find a libwireshark directory; what you will find is an epan directory. This directory contains most of what you need to dissect packets.

All of the code for this blog can be found in some shape or form in my node project Nodeshark. This blog does not cover filters or other advanced features, but you can always do what I did and study the Wireshark, tshark, or rawshark code. I'm going to try to present what I believe to be the minimum amount of code to get packet dissection working.

The first thing you will need to do is initialize epan.

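#include <config.h>

// standard headers used by the snippets below
#include <locale.h>  // setlocale
#include <stdarg.h>  // va_list
#include <stdio.h>   // fprintf, vfprintf
#include <glib.h>    // g_log_set_handler, g_strerror

#include <epan/epan.h>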

// setup permissions
init_process_policies();

// initialize GLib. GLib is used by Wireshark under the covers.
GLogLevelFlags log_flags = (GLogLevelFlags)(
  G_LOG_LEVEL_ERROR
  | G_LOG_LEVEL_CRITICAL
  | G_LOG_LEVEL_WARNING
  | G_LOG_LEVEL_MESSAGE
  | G_LOG_LEVEL_INFO
  | G_LOG_LEVEL_DEBUG
  | G_LOG_FLAG_FATAL
  | G_LOG_FLAG_RECURSION);

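// tshark_log_handler is the log callback that tshark.c defines for
// itself; copy it or pass your own GLogFunc here.
g_log_set_handler(NULL, log_flags, tshark_log_handler, NULL);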
g_log_set_handler(
  LOG_DOMAIN_CAPTURE_CHILD,
  log_flags,
  tshark_log_handler,
  NULL);

// initialize timestamp info
timestamp_set_type(TS_RELATIVE);
timestamp_set_precision(TS_PREC_AUTO);
timestamp_set_seconds_type(TS_SECONDS_DEFAULT);

// initialize epan
epan_init(
  register_all_protocols,
  register_all_protocol_handoffs,
  NULL,
  NULL,
  failureMessage,
  openFailureMessage,
  readFailureMessage,
  writeFailureMessage);

// load all the modules
prefs_register_modules();

// set the locale
setlocale(LC_ALL, "");

// Cleanup all data structures used for dissection. I know
// we haven't done any dissection yet but epan complains
// if this isn't called.
cleanup_dissection();

// Initialize all data structures used for dissection.
// Magical epan function that initializes global variables.
init_dissection();

// Functions to log epan_init errors. Declare these before the
// epan_init call so the compiler can see them.
void openFailureMessage(
  const char *filename,
  int err,
  gboolean for_writing) {
  fprintf(stderr, "filename: %s, err: %d\n", filename, err);
}

void failureMessage(const char *msg_format, va_list ap) {
  vfprintf(stderr, msg_format, ap);
  fprintf(stderr, "\n");
}

void readFailureMessage(const char *filename, int err) {
  fprintf(
    stderr,
    "An error occurred while reading from the file \"%s\": %s.\n",
    filename,
    g_strerror(err));
}

void writeFailureMessage(
  const char *filename,
  int err) {
  fprintf(
    stderr,
    "An error occurred while writing to the file \"%s\": %s.\n",
    filename,
    g_strerror(err));
}

After initialization you probably want to open a scope for your packets. Scoping your packets allows Wireshark to dissect protocols that span multiple packets, such as HTTP. In Wireshark the scope is called a capture file. Here is how we initialize it.

capture_file cfile;
// both are read and updated for every packet, so start them at zero
guint32 cum_bytes = 0;
gint64 data_offset = 0;

cap_file_init(&cfile);

// This will load or not load dissectors based on your
// wireshark preferences.
char *gpf_path, *pf_path;
int gpf_open_errno, gpf_read_errno;
int pf_open_errno, pf_read_errno;
e_prefs *prefs = read_prefs(
  &gpf_open_errno,
  &gpf_read_errno,
  &gpf_path,
  &pf_open_errno,
  &pf_read_errno,
  &pf_path);

// Build the column format array. I believe this holds
// all the columns that Wireshark may return.
build_column_format_array(&cfile.cinfo, prefs->num_cols, TRUE);

cfile.wth = NULL;
cfile.f_datalen = 0;
cfile.filename = g_strdup(""); // don't care about the filename
cfile.is_tempfile = FALSE;
cfile.user_saved = FALSE;
cfile.cd_t = WTAP_FILE_UNKNOWN;
cfile.count = 0;
cfile.drops_known = FALSE;
cfile.drops = 0;
cfile.has_snap = FALSE;
cfile.snap = WTAP_MAX_PACKET_SIZE;
nstime_set_zero(&cfile.elapsed_time);

// set the frame type. This will tell Wireshark
// what the top level frame type is.
int encap = wtap_pcap_encap_to_wtap_encap(1 /* ETHERNET */);

// clear the timestamps
nstime_t first_ts;
nstime_t prev_dis_ts;
nstime_t prev_cap_ts;
nstime_set_unset(&first_ts);
nstime_set_unset(&prev_dis_ts);
nstime_set_unset(&prev_cap_ts);

Now we can dissect packets. To do this we initialize a frame, run the packet through epan, and clean up. In the code below, data points to the raw packet bytes and packetLength is their length.

struct wtap_pkthdr whdr;
whdr.pkt_encap = encap;
whdr.ts.secs = 0;
whdr.ts.nsecs = 0;
whdr.caplen = packetLength;
whdr.len = packetLength;

frame_data fdata;
epan_dissect_t edt;

cfile.count++; // increment the packet count

frame_data_init(
  &fdata,
  cfile.count,
  &whdr,
  data_offset,
  cum_bytes);
epan_dissect_init(
  &edt,
  TRUE,
  TRUE /* dissect the whole tree */);
frame_data_set_before_dissect(
  &fdata,
  &cfile.elapsed_time,
  &first_ts,
  &prev_dis_ts,
  &prev_cap_ts);

// run the dissection on "data"
epan_dissect_run(
  &edt,
  &cfile.pseudo_header,
  data,
  &fdata,
  &cfile.cinfo);

frame_data_set_after_dissect(
  &fdata,
  &cum_bytes,
  &prev_dis_ts);
data_offset += whdr.caplen;

// process packet information

// clean up
epan_dissect_cleanup(&edt);
frame_data_cleanup(&fdata);

This isn't all that useful unless we do something with the packet data. Wireshark returns the data, pretty much just like you see it in the Wireshark GUI, as a tree. Here are some of the things you can do with the data.

// iterate the current node's children
void iteratorFunction(proto_node *node, gpointer data) {
  // node = child node
  // data = the data you passed to proto_tree_children_foreach
  field_info *fi = PNODE_FINFO(node);

  // some nodes don't have field_info. You can still
  // iterate them if you want though
  if(fi == NULL) return;

  fi->length; // size of data in packet

  int posInPacket;
  if (node->parent && node->parent->finfo
    && (fi->start < node->parent->finfo->start)) {
    posInPacket = node->parent->finfo->start + fi->start;
  } else {
    posInPacket = fi->start;
  }

  // abbreviation of node. This is the string you'll
  // see in display filters such as "tcp.srcport"
  fi->hfinfo->abbrev;
  
  // This is the string that you see in the Wireshark GUI,
  // not including the value
  fi->rep->representation;

  // This is the value string you see in the GUI
  char *showString =
    proto_construct_match_selected_string(fi, &edt);
}

proto_node *node = edt.tree; // grab the top level tree node

// data can be anything you want; it just gets
// forwarded on to your iterator function.
proto_tree_children_foreach(node, iteratorFunction, &data);
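
The foreach above only visits one level of the tree. Here is a rough JavaScript sketch of walking the whole tree and computing each field's position, assuming the tree has already been copied into plain objects. The object shape (abbrev, start, length, children) and the name flattenTree are made up for illustration; this is not Nodeshark's actual API.

```javascript
// Hypothetical node shape: { abbrev, start, length, children: [...] }.
// Walks the tree depth-first and returns a flat list of fields.
function flattenTree(node, parent, out) {
  out = out || [];
  // Same rule as the C snippet: a start smaller than the parent's
  // start is treated as relative to the parent.
  var pos = (parent && node.start < parent.start)
    ? parent.start + node.start
    : node.start;
  out.push({ abbrev: node.abbrev, pos: pos, length: node.length });
  var children = node.children || [];
  for (var i = 0; i < children.length; i++) {
    flattenTree(children[i], node, out);
  }
  return out;
}

// Hand-written sample tree, not real dissector output.
var tree = {
  abbrev: "frame", start: 0, length: 34, children: [
    { abbrev: "eth", start: 0, length: 14 },
    { abbrev: "ip", start: 14, length: 20, children: [
      { abbrev: "ip.src", start: 26, length: 4 }
    ] }
  ]
};
var fields = flattenTree(tree);
console.log(fields.map(function (f) { return f.abbrev + "@" + f.pos; }).join(" "));
// prints "frame@0 eth@0 ip@14 ip.src@26"
```

Keeping the result flat makes it easy to look a field up by its abbreviation, at the cost of losing the nesting.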


Unlike many ORM solutions, MS Entity Framework does not lazy fetch accessed tables. Coming from LINQ to SQL I didn't like this at first because it meant adding a bunch of loosely typed include statements to every SQL call. But I quickly realized the performance benefits, not to mention that the N+1 problem just goes away. I still had heartburn over the loose typing of the include statement, though, so I came up with my own include. But first, to set the stage, let me show you what you currently have to write.

  ctx.Users.Include("Order.Item");

This will fetch all Users as well as all the items that they ordered. Because the path is just a string, the compiler can't check it, and a renamed property only fails at run time.

To fix this we need to add a method to Users, which is of type ObjectQuery. We will use extension methods to add my own Include to ObjectQuery. Here is the code for that.

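using System;
using System.Data.Objects; // ObjectQuery<T>; EF 6 moved it to System.Data.Entity.Core.Objects
using System.Linq.Expressions;
using System.Reflection;

public static class ObjectQueryExtensionMethods {
  public static ObjectQuery<T> Include<T>(this ObjectQuery<T> query, Expression<Func<T, object>> exp) {
    // the lambda body must be a property chain such as u => u.Order.Item
    MemberExpression memberExpression = (MemberExpression)exp.Body;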
    string path = GetIncludePath(memberExpression);
    return query.Include(path);
  }

  private static string GetIncludePath(MemberExpression memberExpression) {
    string path = "";
    if (memberExpression.Expression is MemberExpression) {
      path = GetIncludePath((MemberExpression)memberExpression.Expression) + ".";
    }
    PropertyInfo propertyInfo = (PropertyInfo)memberExpression.Member;
    return path + propertyInfo.Name;
  }
}
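
To see what GetIncludePath is doing, here is the same recursion sketched in JavaScript over a hand-built stand-in for the expression tree. The kind/name/expression object shape is invented for this illustration and is not how .NET represents expressions; only the recursion mirrors the C# above.

```javascript
// Each member access wraps the expression it was read from;
// the lambda parameter ends the chain.
function getIncludePath(memberExpression) {
  var path = "";
  if (memberExpression.expression.kind === "member") {
    // build the parent's path first, then append this member
    path = getIncludePath(memberExpression.expression) + ".";
  }
  return path + memberExpression.name;
}

// Stand-in for the lambda u => u.Order.Item
var expr = {
  kind: "member", name: "Item",
  expression: {
    kind: "member", name: "Order",
    expression: { kind: "parameter", name: "u" }
  }
};

console.log(getIncludePath(expr)); // prints "Order.Item"
```

The outermost node is the last property in the chain, which is why the recursion has to build the parent's path before appending its own name.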

Using C# expressions and extension methods we can now write this:

  ctx.Users.Include(u => u.Order.Item);

Refactorable and type-safe.

Entity Framework still has its annoyances, but hopefully this will make it a little less painful. Now if it would only throw an exception if you tried to access a non-included entity instead of just returning null.


I had a requirement to spell check some fields on a custom application page and I knew SharePoint had the ability because on the edit item page there is a "Spelling" button. Come to find out this is one of the easiest things to do and there is really no excuse to not include it on all of your application pages.

The first step is to include the JavaScript needed to run the spell checking. You'll need two includes: form.js and SpellCheckEntirePage.js.

<script type="text/javascript" language="javascript" 
  src="/_layouts/1033/form.js?rev=df60y6YolDjUVbi91%2BZw%2Fg%3D%3D"></script>
<script type="text/javascript" language="javascript"
  src="/_layouts/1033/SpellCheckEntirePage.js?rev=zYQ05cOj5Dk74UkTZzEIRw%3D%3D"></script>

When I saw SpellCheckEntirePage.js for the first time I had to laugh because my initial estimate of the task was 3-4 days; I ended up doing it in less than 4 hours.

Next step is to add the button to actually check the spelling.

<input type="button" value="Spell Check"
  onclick="javascript:SpellCheckEntirePage('<%= SPContext.Current.Web.Url %>/_vti_bin/SpellCheck.asmx', '<%= SPContext.Current.Web.Url %>/_layouts/SpellChecker.aspx');" />

Done!

Well almost. There were a couple of fields on the form which didn't make sense to spell check. But looking at the source of SpellCheckEntirePage.js you can quickly find the solution. Just add excludeFromSpellCheck="true" to the fields you don't want to check.
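
If you can't edit a field's markup directly, the attribute can also be set from script. A minimal sketch, assuming you know the ids of the fields to skip; the function name and the id in the usage comment are hypothetical.

```javascript
// Sets excludeFromSpellCheck="true" on every field that can be found.
// getElementById is passed in so the caller decides where to look;
// returns how many fields were marked.
function excludeFieldsFromSpellCheck(fieldIds, getElementById) {
  var excluded = 0;
  for (var i = 0; i < fieldIds.length; i++) {
    var elem = getElementById(fieldIds[i]);
    if (elem) {
      elem.setAttribute("excludeFromSpellCheck", "true");
      excluded++;
    }
  }
  return excluded;
}

// In a page, with a made-up field id:
// excludeFieldsFromSpellCheck(["accountNumber"], function (id) {
//   return document.getElementById(id);
// });
```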

OK, now I'm done.

Well not quite yet. The "excludeFromSpellCheck" doesn't work on People pickers. But SharePoint has this problem too. If you edit a list item with a people picker and run the spell checker, it will try to spell check people's login names, which is never going to work. I went ahead and added a method to my master page which turns spell check off for people picker fields. It fixed the edit list item spell checking problem too :). I do have to warn you I suck at JavaScript, so if anyone can send me a better way of doing this I would appreciate it.

function disableSpellCheckOnPeoplePickers() {
  var elements = document.body.getElementsByTagName("*");
  for (var index = 0; index < elements.length; index++) {
    if (elements[index].tagName == "INPUT"
        && elements[index].parentNode
        && elements[index].parentNode.tagName == "SPAN") {
      var elem = elements[index];
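      // getAttribute returns null when the attribute is missing, so
      // check for a real value instead of only comparing against "".
      var noMatchesText = elem.parentNode.getAttribute("NoMatchesText");
      if (noMatchesText != null && noMatchesText != "") {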
        disableSpellCheckOnPeoplePickersAllChildren(elem.parentNode);
      }
    }
  }
}

function disableSpellCheckOnPeoplePickersAllChildren(elem) {
  try {
    elem.setAttribute("excludeFromSpellCheck", "true");
    for (var i = 0; i < elem.childNodes.length; i++) {
      disableSpellCheckOnPeoplePickersAllChildren(elem.childNodes[i]);
    }
  } catch (e) {
    // Non-element nodes (text, comments) have no setAttribute; ignore them.
  }
}

I don't usually blog about subjective things like my likes and dislikes, and I usually like (some will say I am in love with) Microsoft, but TFS is just horrible.

Here is a list of why I don't like it, in no particular order:

  • CodePlex -- First there was the SVN bridge (yes, someone hated TFS enough to make a product to make it look like something else) then Microsoft just caved and supported SVN directly. Not really a problem with TFS, it's just an indication that others feel the same way as I do.
  • Size -- I guess Microsoft doesn't really like it either since they don't even ship it with Visual Studio. If you do need it, it's a separate download and a big one at that -- 200MB+. Subversion, Tortoise SVN, and Ankh on the other hand combined are less than 15MB. How they filled up 200MB I have no idea; maybe they installed a client that doesn't suck somewhere I don't know about.
  • Read-only Files -- Every file on your system is read-only. Why does everything need to be read-only? If I'm in another editor and I want to make a change to a file, I need to switch over to VS and check out the file for edit. This is ridiculous.
  • Identical Files -- I just want to see what changed. Why does TFS insist upon showing me all the files even if they are identical? Call me crazy but I like to see what files I actually changed when I check in. Someone told me there is some command line tool to see this but that's just silly.
  • Code reviews/update -- On my current project I'm the project lead and I like to review the junior developers' code when I update my code. Yeah, I can't do that, or I haven't found the button yet. Nor can I find the button to view a particular revision without finding a file that was part of that change and viewing its history.
  • Deleted files -- Team Explorer doesn't show them by default. It took me a while to figure this one out. You need to go into VS options, then find TFS, and then click show deleted files. Why is this not on by default, and why isn't there a button to toggle this setting in Team Explorer?
  • Project file modification -- We have some people working inside a VPN and some out, the TFS server name is different depending on where you sit. Since the solution file stores the connection string to TFS it's constantly getting checked in.
  • Everything needs to be in the solution -- If you have a bunch of support files or non-.NET files, all of them need to be added to the solution or they don't get checked in or out. VS doesn't help you with this either because you can't have a solution folder map to a directory.
  • Integrating with non-.NET developers -- If you have a mixed development environment (Java, Ruby, etc.) like we do, you need to run TFS and something else because the non-.NET developers can't use TFS, especially if they are on a Mac. Usually the something else is much better anyway so you might as well use that instead.
  • Working offline/Speed -- If you ever travel, relax and take a nap because you won't be developing. Everything you do talks to the server and that in turn makes it slow. Open up a file and start making changes, to the server you go. Move a file, to the server you go. Diff a change you made, to the server you go.
  • Workspaces -- Whoever came up with the concept of workspaces at Microsoft should be ashamed of him/herself. They are just a stupid idea. If you want two copies of the same project checked out (I do it with SVN if I'm working on a quick bug fix and a big feature at the same time), good luck, because I can't figure it out.
  • Server isn't free -- 'Nuff said.

I'm sure there are other problems but this is all I could think of while writing this.

Sure, there are some nice things about TFS, but honestly the cons far outweigh the pros. So if you have to decide which SCM to use and you run across this blog, I think it's pretty obvious that I don't recommend it. I love SVN, and as the Git tools mature I'm sure I will switch to that.


I had a problem with creating and using tasks in a SharePoint state machine workflow, so I wanted to capture it in a blog so that others won't have to fight through it. There isn't a lot of magic, just a few gotchas which I'll point out.

Let me start by showing an overview of the workflow we are trying to create.
workflow_overview.jpg
This is about as simple as it gets. Start -> Manager Approve -> Done.

If you look at "ManagerReview" you see we have three things.

  • ManagerReviewInit - This activity is responsible for creating the task
    Gotcha #1 - Creating the task here keeps task creation inside the state, so if the workflow later transitions back to the ManagerReview state, it will know how to create a new task.
  • OnManagerReviewTaskCreatedEvent - This activity is going to initialize the task
  • OnManagerReviewTaskChangedEvent - This activity is going to perform the logic of when the task is completed

ManagerReviewInit

manager_review_init.jpg
No surprises here. Unless you take a look at the CorrelationToken on the create task.
create_manager_review_task_props.jpg
Gotcha #2 - The OwnerActivityName needs to be scoped at the "State" level (in our case "ManagerReview"). This is important because later if you add a state that transitions back to "ManagerReview" you will get an exception stating that the correlation token was already initialized. Narrowing the scope will invalidate the correlation token when you leave the state.

OnManagerReviewTaskCreated

manager_review_on_create.jpg
Gotcha #3 - At first I assumed that you needed to have a "set state" at the end of every event to loop back onto itself. You don't; there is an implied loop back. In fact if you do, the state initialization routine will be called again, causing another task to be created, which will basically create an infinite recursion.
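
A toy model makes the difference easier to see. This sketch is plain JavaScript, not the Windows Workflow Foundation API; createMachine, setState, raise, and runReview are invented names. It models only one rule: entering a state always runs its initialization.

```javascript
// Toy state machine: setState always runs the target state's init.
function createMachine(states) {
  var current = null;
  var machine = {
    setState: function (name) {
      current = name;
      states[name].init(machine);
    },
    raise: function (eventName) {
      var handler = states[current].events[eventName];
      // a handler that does not call setState simply stays in the state
      if (handler) { handler(machine); }
    }
  };
  return machine;
}

// Counts how many tasks get created for a given number of events.
function runReview(loopBackExplicitly, eventCount) {
  var tasksCreated = 0;
  var machine = createMachine({
    ManagerReview: {
      init: function () { tasksCreated++; }, // stands in for ManagerReviewInit
      events: {
        taskCreated: function (m) {
          // explicit "set state" re-enters the state, so init runs again
          if (loopBackExplicitly) { m.setState("ManagerReview"); }
        }
      }
    }
  });
  machine.setState("ManagerReview");
  for (var i = 0; i < eventCount; i++) { machine.raise("taskCreated"); }
  return tasksCreated;
}

console.log(runReview(false, 3)); // prints 1
console.log(runReview(true, 3));  // prints 4
```

The toy stops after a fixed number of events. In the real workflow each new task raises its own created event, so the explicit loop back never stops.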

OnManagerReviewTaskChanged

manager_review_on_change.jpg
If we take a closer look at the conditional you'll see I'm using a property I created, TaskComplete.
complete_rule.jpg
Gotcha #4 - Microsoft doesn't expose the task status, and since most users don't update percent complete when completing tasks, you have to attach to the changed event and expose it to the workflow yourself.
public bool TaskComplete { get; set; }

private void OnManagerReviewTaskChanged_Invoked(object sender, ExternalDataEventArgs e) {
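  // the status may be missing from the changed properties, so guard against null
  object status = AfterTaskProperties.ExtendedProperties[SPBuiltInFieldId.TaskStatus];
  TaskComplete = status != null && status.ToString() == "Completed";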
}