PowerShell–Bug Fixing and Assembly Patching for Guts and Glory

I’ve got a little break from the grind at the moment in work, being on what they call “The Bench.” This is a hockey reference of course (obvious to all fellow Canadians, I’d assume); the relative calm before the storm of the next project. Taking a break from my usual activities, I decided to take the time to look around for something painful, pointless and fun to do with PowerShell. Hah, who am I kidding. This is my favourite thing to do. Anyway, there was a bit of discussion on our private MVP mailing list concerning some weird behaviour of one of the new cmdlets in PowerShell 3.0: Invoke-RestMethod. It turns out it has a bug. Well, it appears to have about a dozen right now, but I chose this one to look at since it was the one we were discussing.

The Invoke-RestMethod Bug

After I’d done the work for this blog, I went to log the bug but someone had already done it. The "invoke-restmethod-is-truncating-results" bug was logged by Trevor Sullivan, a well known PowerShell enthusiast within our community. He noticed that when it’s used to grab an RSS or ATOM feed, it seems to give you back only about half of the items you want, and it’s not the first half either. I fired up Reflector (I swear if program icons could be worn down, this one would be a shiny nub at this point) and quickly isolated the method in question. I know this cmdlet name has a generic sounding noun indicating that it is used primarily for REST calls, but it actually gives special treatment to data that appears to be a syndication feed. This is presumed to cater to the most common use cases, according to Microsoft. The reconstituted method that does the work looks like this:

private bool TryProcessFeedStream(
    BufferingStreamReader responseStream)
{
    bool flag = false;
    try
    {
        XmlReaderSettings secureXmlReaderSettings =
            this.GetSecureXmlReaderSettings();

        XmlReader reader = XmlReader.Create(responseStream,
            secureXmlReaderSettings);

        for (int i = 0; (i < 10) && reader.Read(); i++)
        {
            if (string.Equals("rss", reader.Name,
                    StringComparison.OrdinalIgnoreCase) ||
                string.Equals("feed", reader.Name,
                    StringComparison.OrdinalIgnoreCase))
            {
                flag = true;
                break;
            }
        }
        if (flag)
        {
            XmlDocument document = new XmlDocument();
            while (reader.Read())
            {
                if ((reader.NodeType == XmlNodeType.Element) &&
                   (string.Equals("Item", reader.Name,
                       StringComparison.OrdinalIgnoreCase) ||
                    string.Equals("Entry", reader.Name,
                       StringComparison.OrdinalIgnoreCase)))
                {
                    XmlNode sendToPipeline =
                        document.ReadNode(reader);

                    base.WriteObject(sendToPipeline);
                }
            }
        }
        return flag;
    }
    catch (XmlException)
    {
    }
    finally
    {
        responseStream.Seek(0L, SeekOrigin.Begin);
    }
    return flag;
}

Analysis

The method itself looks pretty simple. It scans through the first ten nodes looking for a "feed” or “rss” element. If it finds this, it creates a new empty XML document instance and scans the rest of the input looking for “entry” or “item” elements, the meat of an ATOM or RSS feed respectively. If it finds one, it clones it into the holding document, and loops around the while condition, which advances the reader, one node at a time until it returns false, terminating the loop.

I hadn’t read Trevor’s report at this time, so the talk on the list said that the items coming appeared to be pretty random. Now, faced with the simplicity of the processing code, I figured it was just skipping nodes. I suspected a double-read call going on somewhere and the document.ReadNode line was my primary suspect; a quick look in Reflector at the ReadNode implementation confirmed this. In order to validate the bug and the subsequent fix, I decided to mock up the method in PowerShell script so I could repro the behavior:

function bugbug {
    # returns last TEN items from the powershell team blog
    $req = [System.Net.WebRequest]::Create(
        "http://blogs.msdn.com/powershell/rss.aspx")
    $res = $req.GetResponse()
    $stream = $res.GetResponseStream()

    $flag = $false

    $xmlsettings = [System.Xml.XmlReaderSettings] @{
        CheckCharacters = $false;
        CloseInput = $false;
        IgnoreProcessingInstructions = $true;
        MaxCharactersFromEntities = 0x400L;
        DtdProcessing = "Ignore"
    }

    $reader = [System.Xml.XmlReader]::Create($stream, $xmlsettings) 

    for ($i = 0; ($i -lt 10) -and $reader.Read(); $i++) {
        if (@("rss", "feed") -icontains $reader.Name) {
            $flag = $true
            break;
        }
    }

    if ($flag) {
        $doc = new-object System.Xml.XmlDocument
        while ($reader.Read()) {
            if (($reader.NodeType -eq "Element") -and
                (@("item", "entry") -icontains $reader.Name)) {
                $node = $doc.ReadNode($reader)
            }
        }
    }

    if ($res) { $res.Dispose() }
}

The Fix

So, knowing now that the ReadNode method will advance the reader one node as a side effect (technically, it will not advance the reader if the context node is an attribute, but that’s not our case here,) the problem is clear. We should loop until the reader is at the end, and when processing a node we should either call ReadNode to clone it, or call Read on the reader to try the next node.  With this in mind, the fixed while loop – modeled in script – looks like this:

# loop until EOF
while ($reader.ReadState -ne "EndOfFile") {
    if (($reader.NodeType -eq "Element") -and `
        (@("item", "entry") -icontains $reader.Name)) {
        # copy node
        $node = $doc.ReadNode($reader)
    } else {
        # next
        [void]$reader.Read()
    }
}

Ok, great. We’ve figured out what the code should have been, or at least one apparently working variant of it. So given that I prequalified this adventure with my masochistic intentions, let’s set about fixing the bug.

Reflexil

Many of you will be familiar with Lutz Roeder’s excellent Reflector utility, which allows anyone to glance inside any .NET assemblies to discover how they are implemented. Indeed, this is how I diagnosed the bug as explained above. One  of the cool things about Reflector is that it allows third party extensions that can hook into the engine to provide additional functionality. One of these such extensions is the most excellent Reflexil, which leverages JB Evain’s Mono.Cecil library to allow “editing” of assemblies.

image

Now, when I say “editing,” I am probably making it sound a lot less fiddly than it really is for most cases. While Reflexil will let you compile inline C# to edit or augment an assembly, the inline source must essentially be “stand alone” in that it cannot reference things beyond its scope or visibility. For example, you can’t make reference to private or protected members of the Type you are augmenting or editing. So, what I ended up doing was writing a partial implementation of the method I wanted to patch and then using Reflector to display the IL (intermediate language.) I then dumped the IL of the faulty method, and pasted both streams into an Excel spreadsheet so I could figure out what parts of the method I had to edit in Reflexil. My first attempt had me absent mindedly compile up my replacement method as Debug build, and the IL ended up being so wildly different that it would have taken me much longer than necessary to patch the assembly. Oops!

image

A bit more tinkering with compile options and targeting the Release configuration yielded me with a corrected IL dump that only differed by about fifteen instructions.

Final Steps

So after finally saving the modified assembly, Microsoft.PowerShell.Commands.Utility.Patched.dll, to disk, I am ready to load it into PowerShell. Now, if you’re one of the people who is still awake by this point, you’re probably thinking: “Well, that ain’t gonna work. You broke the strong name signature when you edited the assembly.” Indeed I did, but by using the SDK’s SN.EXE strong name tool, I disabled strong name verification for this assembly. This allows me to load the assembly and the CLR with skip the verification steps for the strong name, ultimately letting me replace the broken Invoke-Restmethod with a working one. Et Voila: Invoke-RestMethod is returning all items from the RSS feed, like it should do:

image

As an aside, another approach would have been to edit the assembly in the GAC in-situ, as the CLR will not verify strong named assemblies that are already in the GAC. Verification only happens at install time; this is an optimization for the loader that was added some time ago.

So, yeah, I could have just written a function with the same name to replace the Cmdlet, letting command precedence do the hard work, but that’s a bit boring, right?

Have fun!

About the author

Irish, PowerShell MVP, .NET/ASP.NET/SharePoint Developer, Budding Architect. Developer. Montrealer. Opinionated. Montreal, Quebec.

Month List

Page List