for( auto& post : post_stream ) { …

  • My First Experience with Server-Side Swift (using Perfect from perfect.org) – Part 1


    Motivations

    Most of the work I take on involves languages like C++, Swift, or Objective-C and is not typically web-based.  While I’m normally nestled comfortably in the embrace of C++ template metaprogramming or the warm walls of an operating system kernel, even I can sense the uncivilized barbarians at the gate (JavaScript) and the increasing demand that software work more closely with web-based APIs and various web stacks (Django, Rails, Node.js).

    Recently, I was given the chance to explore the use of server-side Swift.  The client was attempting to ferry computational requests from a number of web and iOS-based clients to a very large computational backend written in C++ (complete with its own ORM) that was distributed across a number of nodes.  Since one of the core libraries used happened to be written in Swift, there was some hope we could simply reuse that code on the server side without much effort.  Rather than compile a separate application and marshal requests to the Swift library through yet another framework, the decision was made to service requests outright from a web service authored in Swift.

    Instead of authoring web services from scratch, I looked for a ready-made solution to simply embrace and extend.  There were a few different frameworks to choose from, but I eventually settled on Perfect (from http://www.perfect.org).  The decision to use a server-side Swift framework, given other mature technologies, was not without internal friction.  Criticism was leveled that I was using untested “hipster” technology.  In fact, some of my colleagues insisted I go so far as to don a beret, head to the local hipster coffee shop, and offer poorly authored poetry to the public about being oppressed by self-authored technical debt!  Despite critics’ concerns, I want to highlight that I had some success with Perfect!  (Also, not enough technical debt was accumulated to warrant lobbing harangues-inappropriately-called-poetry at the public.)  Of course, the effort was not completely glitch-free, and that is part of what I will document here.

    Why Perfect?

    My choice was based on a mix of inputs: GitHub commit activity, public issue trackers, various blogs, popularity, and out-of-the-box features.  I watched a YouTube video from the CEO of the organization responsible for Perfect and thought his developers’ take on things aligned with my own views.  Moreover, in my particular environment, I needed connectors for Redis and Postgres.  The support for WebSockets and the basic documentation gave me enough confidence in the product to give it a shot.  If Perfect failed me, I figured, it would fail me quickly and in an obvious enough fashion that I could bail without disastrous results.

    For our internal use case, some “jankiness” in the tech stack was tolerable provided we got results often enough between failures.  I was willing to fix bugs, provided they weren’t too deep rooted.  Basically, Perfect really needed to be “good enough” — and for our use case, it was.

    The Use Case

    Our use case: accept work requests from web, desktop, or mobile clients; perform some transformations and filtering on those requests; service them directly; or (in the case of computationally expensive requests) dispatch them to a work queue (Redis).  Binaries written in C++ would then consume elements from Redis and place their results back in a database (Postgres) or, in some cases, Redis.  As the computational backend progressed on its work requests, updates were provided via WebSockets (also via Perfect) to web-based clients.

    Well, did you succeed?

    I did!  In fact, the general architecture of Perfect facilitated our needs quite well.  I would say the initial prototyping and deployment of a basic service went very smoothly.  Even the transition from Swift 3 to Swift 4 went smoothly; the developers responsible for Perfect handled it in a very timely fashion.  There was no extended period in which I had clients on one version of Swift with the server on another.

    Routes, URL Variables, Responses

    In Perfect, it is very easy to create a server, add routes, and service requests.  No particularly advanced knowledge of Swift is really required to accomplish this.  Given my experience with other frameworks, I did not encounter too much difficulty just getting up and running.  The basic process is to set up a route, pass a function parameter to handle the route, and reply using response objects.  Most of the basic web primitives for handling URL variables, posted parameters, and managing responses are provided.

    Implementing an API is simple.  Return the right response type, format your response with the proper encoding, and write the response.  I hit no major issues or bugs in performing these basic tasks.
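
    To give a flavor of this, here is a minimal sketch of a route and handler along the lines of what I used.  It is based on Perfect’s routing API as I remember it from the Swift 3/4 time frame; the endpoint, port, and JSON payload are made up for illustration.

    import PerfectHTTP
    import PerfectHTTPServer

    // Illustrative handler: the path and payload are invented,
    // not the actual endpoints from the project.
    func handleStatus(request: HTTPRequest, response: HTTPResponse) {
        response.setHeader(.contentType, value: "application/json")
        response.setBody(string: "{\"status\": \"ok\"}")
        response.completed()
    }

    var routes = Routes()
    routes.add(method: .get, uri: "/status", handler: handleStatus)

    let server = HTTPServer()
    server.serverPort = 8181
    server.addRoutes(routes)

    do {
        try server.start()
    } catch {
        print("Unable to start server: \(error)")
    }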

    Encouraging!

    Using Swift on the Server Side in Linux

    Swift on the server side does have some gotchas.  This isn’t Perfect’s fault, but it did affect me: if I developed code in Xcode and then recompiled and deployed on Linux, some small snippets of code would not compile.

    In addition, the Foundation libraries, while functionally complete, have some corner cases where things can get awkward.  One (cosmetic) thing I found somewhat unusual was the use of NSRegularExpression.  OS X and iOS developers are used to seeing “NS”-prefixed classes, but the prefix felt slightly out of place on Linux.  Moreover, NSRegularExpression (and its affiliated functions) is less pleasant to use than the regex facilities in other languages.  (Obviously, this is an opinion, but I suspect many readers who’ve used other frameworks would agree with me.)
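
    As a small illustration of the ceremony involved (this is plain Foundation, nothing Perfect-specific, and the pattern and input string are invented), pulling a capture group out of a string looks something like this:

    import Foundation

    // Extracting an id from a path with NSRegularExpression.
    // Note the explicit options, the NSRange bridging, and the optional
    // capture-group handling; it all works the same on Linux, it just
    // feels heavier than the regex support in many other languages.
    let path = "/jobs/42/status"
    let pattern = "/jobs/([0-9]+)/status"

    if let regex = try? NSRegularExpression(pattern: pattern, options: []),
       let match = regex.firstMatch(in: path, options: [],
                                    range: NSRange(path.startIndex..., in: path)),
       let idRange = Range(match.range(at: 1), in: path) {
        print("job id:", path[idRange])   // prints "job id: 42"
    }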

    During the transition between Swift 3 and 4, I hit some snags with certain function signatures changing.  In some cases, my code was littered with conditional compilation blocks like #if os(macOS) and #if os(Linux) to tweak behavior on different platforms.  I didn’t love this, but it wasn’t too difficult to work around and was only a small drawback.
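
    The snippet below is not from the project, but it captures the flavor of those checks.  In the Swift 3/4 time frame there was no unified random-number API, for example, so even something this small needed a platform branch:

    #if os(Linux)
    import Glibc
    #else
    import Darwin
    #endif

    // Not from the project; just the flavor of the platform checks.
    // Glibc exposes random(), Darwin exposes arc4random_uniform().
    func randomValue(below upperBound: UInt32) -> UInt32 {
    #if os(Linux)
        return UInt32(random()) % upperBound
    #else
        return arc4random_uniform(upperBound)
    #endif
    }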

    Another issue I encountered was performing semi-low-level tasks: I ended up needing third-party libraries to do things like atomically incrementing integers.  I found this somewhat distasteful, but not unreasonable given how Swift has been evolving and changing over the past few years.  Some primitives available on OS X were not available on Linux, and the suggestions on Stack Overflow seemed clunky.  I ended up using a third-party library that made effective use of features provided by Clang.  Performing some systems-software tasks in Swift still feels awkward compared to using C or C++.  This could be my own inexperience, but it certainly felt awkward, and I’d caution others facing the same scenarios.
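
    I won’t reproduce the library here, but as a rough illustration of the primitive I was after, the sketch below hides an NSLock behind the sort of interface I wanted.  (The library I actually used relied on Clang’s atomic builtins rather than a lock; this is just a stand-in.)

    import Foundation

    // A lock-based stand-in for an atomic counter.  The real library
    // wrapped Clang's atomic builtins; this version simply guards a
    // plain Int64 with an NSLock to show the desired interface.
    final class AtomicCounter {
        private var value: Int64 = 0
        private let lock = NSLock()

        @discardableResult
        func increment() -> Int64 {
            lock.lock()
            defer { lock.unlock() }
            value += 1
            return value
        }

        var current: Int64 {
            lock.lock()
            defer { lock.unlock() }
            return value
        }
    }

    let requestCount = AtomicCounter()
    requestCount.increment()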

    Interfacing to Postgres

    I did not use the ORM (StORM) provided with Perfect.  I did, however, issue SQL requests via their Postgres package.  In many languages and frameworks, authoring SQL requests is a bit of a pain, and it wasn’t any different in Swift with Perfect.  I’m not sure there’s a way around painfully constructing strings for complex queries, but Swift offers no perceptible edge for the task either.
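
    For what it’s worth, issuing a parameterized query looked roughly like the sketch below.  I’m reconstructing this from memory of the Perfect-PostgreSQL package, so treat the module, type, and method names as approximate; the connection string, table, and columns are invented.

    import PerfectPostgreSQL

    // Reconstructed from memory of Perfect-PostgreSQL; names are
    // approximate and the schema is invented for illustration.
    let conn = PGConnection()
    guard conn.connectdb("host=localhost dbname=jobs user=worker") == .ok else {
        fatalError("unable to connect to Postgres")
    }

    // $1-style placeholders at least keep the parameters out of the
    // hand-built query string.
    let result = conn.exec(statement:
        "SELECT id, state FROM job_requests WHERE state = $1",
        params: ["pending"])

    for row in 0..<result.numTuples() {
        let id = result.getFieldString(tupleIndex: row, fieldIndex: 0) ?? "?"
        let state = result.getFieldString(tupleIndex: row, fieldIndex: 1) ?? "?"
        print("job \(id): \(state)")
    }
    result.clear()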

    One issue that did cause me to raise an eyebrow was a bug filed in the Perfect issue tracker regarding memory leaks on tables returned from the Perfect driver built on top of libpq.  My use case did not stress the Postgres driver enough for me to experience anything catastrophic.  At the time of writing, the issue has not been closed.  This could be a concern for long-running services issuing a very large number of requests.  Hopefully the developers will address it.

    What’s Coming in Part 2

    In part 2, I’ll detail a few issues I hit in Perfect-Mustache, issues I encountered in the use of Perfect-Redis, and other architectural gotchas that I encountered.  The issues I touch upon in Part 2 have more substance.  Stay tuned!


  • Beat Detection: A Raspberry Pi Hacking of Hallmark’s “Happy Tappers”

    In graduate school, time series analysis was the topic I liked the most. The classic models (AR(p), MA(q), ARMA(p,q), etc.) are usually applied after a time series has been sampled in some fashion. Of course, there’s more than one way to look at a time series, and many of those perspectives come from the field of digital signal processing. Unfortunately, a lot of the textbooks on DSP are dry. One can spend hours reading a DSP book, understand the math, and still not really appreciate the material. I happen to think appreciation for the topic comes more from trying to solve real-world problems. As problems are encountered, the motivation arises to go back and attack the mathematics in a meaningful way. One of the more interesting problems, in my opinion, is real-time beat detection in music, so I decided to do some experimentation with it.

    After Christmas, I went through the pharmacies’ holiday clearance sales and picked up a bunch of cheap Hallmark ornaments. When I discovered the ornaments had a port for interfacing with each other, I decided I’d look at the interfacing method and devise my own schemes for controlling them. After figuring out the communications protocol between ornaments, I experimented with building control systems using FPGAs, Arduinos, and the Raspberry Pi, and I ultimately settled on the Raspberry Pi.

    With the Raspberry Pi, I was able to perform FFTs on the music samples fast enough to make a crude attempt at detecting beats while playing MP3 files. The project is still something I tinker with from time to time, and I find entertainment in looking at various approaches to beat detection in research papers and documents on the internet. Below is a sample of what I have been able to create so far: a hack of Hallmark’s Happy Tappers, set to a clip of Capital Cities’ “Safe and Sound”:

    As precious free time becomes available, I’d like to improve the system to work with onset detection and explore various filtering problems.

    Finally, I’d like to thank the Dishoom Labs crew (MB) for letting me borrow some equipment during the project.

    (Edit:  Removed Flickr Video and replaced with YouTube.)

  • Understanding How Bitcoin Works

    Long time, no update. 🙂 It’s difficult to produce quality content, which is why I was very impressed with the person who made this particular video:


    Understanding how Bitcoin works can be difficult, but this particular video is exceptional in breaking the concepts down and connecting them together. I consider it another “must-see” YouTube video, if you have any interest at all in cryptocurrency.

  • Using Matplotlib to get XKCD-style Plots

    This past week, I found another developer blog where a post caught my attention: Pythonic Perambulations.  More specifically, it was this post regarding presenting data in an “xkcd” style.  I figured I’d give this blog post some inbound link props because I found the content both useful and amusing.  The author seems to have gotten quite a bit of well-deserved praise already in his blog’s comments section.

    Readers of xkcd are often shown very interesting plots illustrating a humorous point or interesting concept.  In any serious data analysis project, there are often many iterations of sifting and sorting through data until some hypothesis is rejected or accepted.  Rather than put up a disclaimer warning against drawing serious conclusions from a rough-draft-grade presentation, I may just start xkcd-ifying things.  If nothing else, xkcd-ifying plots may take the sting out of any data-driven negative implications.  🙂


  • C++ Reading List

    In the past, I’ve paid for various libraries (such as Anthony Williams’ excellent just::thread library) and C++11 documents in an attempt to get ahead of the curve.  Now, however, C++11 information is becoming more easily accessible.  There is quite a bit of new content out there making things more digestible for programmers who aren’t so involved as to be following the evolution of the C++ standard.

    I think this particular Dr. Dobb’s article/link is technical gold for people navigating the new C++11 standard and its feature set: C++ Reading List

    A key item to note is that Bjarne Stroustrup’s book, “The C++ Programming Language”, has had its 4th edition released recently.  (It is also in the list.)


  • C++11 Idioms Slideshow from Silicon Valley Code Camp

    I stumbled across this very nice presentation (see below) by Sumant Tambe on C++11 idioms on Google+ and figured I’d stash a link here.

    I try to integrate as much C++11 as I can into my projects where I think there would be benefit. However, without reading the C++ programming community’s input on various programming techniques, it’s difficult to know the drawbacks of using a particular method until it’s too late (and costly). Whenever new idioms like this are presented in a clean format like in Sumant’s slideshow, I’m always eager to read about the general consensus/feedback/comments of others.  (Incidentally, Sumant’s blog is a gold mine of interesting C++ techniques; many [enjoyable] hours can be lost there.)

    Have a look for yourself:

  • C++, WebSockets, and Mt.Gox

    I’ve been reading quite a bit about WebSockets lately. To test my understanding of the concepts involved, I tried to connect an application to Mt.Gox in order to collect bitcoin-related pricing information. I had some difficulty getting all of the parameters aligned properly across the various WebSocket libraries, so I went through the process of reading other people’s code on GitHub, sifting through Stack Overflow-style forums, and experimenting until things worked correctly. I thought I’d document some of the minor hurdles here, just to save some people out there in the wild some effort.

    I tried a few approaches:

    • Python using autobahn
    • C++ using libwebsockets (a C library)

    The Python implementation worked perfectly.  The only quirk was that I had to trick the application into inserting an “Origin:” field in the WebSocket handshake in order to get it to connect.  (This is related to the Web Origin Concept, a safety mechanism for web browsers.)  After tweaking the code to force an ‘Origin’ using the autobahn library, the handshake with the server ended up looking something like this:

    GET /mtgox HTTP/1.1
    User-Agent: disease
    Origin: websocket.mtgox.com:80
    Host: websocket.mtgox.com:80
    Upgrade: WebSocket
    Connection: Upgrade
    Pragma: no-cache
    Cache-Control: no-cache
    Sec-WebSocket-Key: JI2o83NARHJDps1XCN7eCQ==
    Sec-WebSocket-Version: 13

    After a handshake of this style, data starts streaming over the connection with no issue. With a working Python version, I had enough knowledge to craft an appropriate handshake in the C++ version of my bitcoin application.

    With libwebsockets, it took some tinkering with the source code to get things to work right with Mt.Gox. The extensions field in the ‘lws_context_creation_info’ should not be filled in, and in the connection phase in libwebsocket_client_connect(), the specified protocol should be NULL. The origin and host strings fed into the libwebsocket_client_connect() function should mirror what you see above in the successful python handshake.

    I didn’t bother to post code here, but the general idea for getting things to stream correctly is to produce a handshake that looks something like the Python one I’ve copied above.

    I hope this helps the random Googler out there searching for information.

  • Matonis: “Fincen’s New Regulations Are Choking Bitcoin Entrepreneurs”

    I found this informative article written by Jon Matonis detailing the current hurdles in the way of mass adoption of crypto-currencies within the US.  Technology enthusiasts (like me) tend to dive headfirst into the “how things work” angle of emerging trends like bitcoin, but much of what determines the fate of new technology lands in other domains such as government and public policy.

    Matonis talks about past efforts (PayPal, E-Gold) and moves into descriptions of how current bitcoin businesses are navigating the landscape.  I thought the content presented was an excellent summary and worth the read.  Here’s a snippet from the article:

    In a recent speech, Fincen Director Jennifer Shasky Calvery said the new guidance aims “to protect [digital currency] systems from abuse and to aid law enforcement in ensuring that they are getting the leads and information they need to prosecute the criminal actors.” She reiterated that the guidance does not apply to everyday users who pay or accept bitcoin for goods and services.

    But by saddling startups with compliance requirements, and making them unattractive clients for regulated banks that despair of serving MSBs, Fincen is choking these businesses that facilitate conversion of bitcoins into dollars. Fewer exchanges and more red tape will make it harder for merchants or consumers (who, after all, must still pay the bills with dollars) to take advantage of the Bitcoin payment system’s speed, privacy and competitive costs.


  • Pitfalls of Bitcoin Ownership: Disk Failure

    I was in the middle of browsing the web when suddenly my system stopped responding.  The mouse was still moving, but none of the windows were registering actions; my system had frozen.  After cycling the power, I noticed the kernel message log spewing out all kinds of errors related to the disk drive that housed my “/home” partition.  My disk was failing.  Normally, I’d just throw in a new disk, recover from backups, clone the appropriate git repos, and proceed.  However, this time was slightly different because my bitcoin wallet was on the failing drive and wasn’t backed up yet.

    Thankfully, the value of my bitcoins doesn’t amount to much more than a few dollars.  Nevertheless, had I been reliant on bitcoins for substantial financial purchases, this could have been a disaster scenario.  There is much talk on internet forums about people losing thousands of dollars’ worth of bitcoins to hardware failure.  Even though there was more than adequate warning to pursue backups, I didn’t consider that I’d be confronted with such a scenario so early in my experience with crypto-currency.  (I figured the dollar value of my holdings wasn’t substantial enough to care, and I had only had the coins for a week or two before the disk failed.)  However, after the disk failed, I was determined to get my coins back, and I managed to succeed.

    This post details roughly what I did to recover my wallet file on a Linux-based operating system with an ext4 filesystem.  Unfortunately, I don’t have screenshots, and the process I describe below may be somewhat imprecise.  However, my post might serve as a rough outline of how to retrieve your wallet in the event you start seeing evidence of disk failure.  Please note that I can’t make safety or success guarantees about these steps, and they may not work for you (in fact, they may even cause damage).  If you have substantial bitcoin value at risk, you may want to read a few different sites before trying these steps or, better yet, find a data recovery specialist.

    In my situation, I use an ext4 file system.  Because my system was for general-purpose use, I did not have the /home partition encrypted.  When the disk stopped operating, I took the following steps to recover my bitcoins:

    1. I downloaded an Ubuntu 12.04 LiveCD
    2. After booting from the CD, I clicked “Try Ubuntu”
    3. I loaded Firefox, downloaded TestDisk, and unpacked the files.
    4. After launching XTerm, I ran TestDisk (as root/sudo’d).  I navigated to the partition I knew was my ext4 partition on the failing disk (in my case, a partition on /dev/sdb).  I then used the file system utilities menu (“Filesystem Utils”) and tagged the partition (via the [T]ype operation in TestDisk) as an LVM->Ext4 partition.  Before using the [T]ype operation, TestDisk showed the ext4 partition as just “MS Data”.  Specifying the type made TestDisk “see” the partition correctly.
    5. I tried to list the files in TestDisk (I think it was the “L” command); this failed.  Don’t panic (yet).
    6. Since listing the files failed, I used the TestDisk superblock option to get the appropriate disk information from the backup superblocks on the filesystem.  TestDisk suggests an fsck.ext4 command to attempt to repair the filesystem, but I ignored it, went back to the menu, and retried the “list” operation.
    7. Once the files started listing, I navigated to the /home/<username>/.bitcoin directory and copied wallet.dat to a USB memory stick I had inserted.

    Once wallet.dat was recovered and the disk drive was replaced, I restored /home from my usual backups and placed the wallet file into a new .bitcoin directory on my system.  ‘bitcoin-qt’ (my wallet app) accepted the file, successfully recovered all of my bitcoins, and began synchronizing with the p2p network.

    To prevent future scenarios like this, I configured RAID and put a script in place to routinely back up the wallet file.  For the record, TestDisk is amazing software.  I just hope I never have to use it again.  🙂

    Good luck to those who are ever in this situation.  I hope the steps above help you, or at least point you in the direction of the right tools to recover your coins.


  • Merkle Trees

    The paper detailing the mechanics of Bitcoin is a fascinating read. However, the technology involved requires several re-readings of the paper to make sense of what is going on.  I’ve been attacking the concepts piece by piece as time permits. In particular, the Bitcoin paper’s comments regarding the use of the Merkle Tree caught my eye. I had never encountered the concept of a Merkle Tree before in any applications I’d worked with, so I figured some investigation would be worthwhile.

    Within Bitcoin, the Merkle Tree is a disk-space-saving mechanism.  From the Bitcoin paper:

    “Once the latest transaction in a coin is buried under enough blocks, the spent transactions before it can be discarded to save disk space. To facilitate this without breaking the block’s hash, transactions are hashed in a Merkle Tree […], with only the root included in the block’s hash. Old blocks can then be compacted by stubbing off branches of the tree. The interior hashes do not need to be stored.”

    I did some digging into the Merkle Tree and read some of the implementation code to understand it. I found that the Wikipedia page does a pretty good job summarizing the data structure:

    “In cryptography and computer science a hash tree or Merkle tree is a tree in which every non-leaf node is labelled with the hash of the labels of its children nodes. Hash trees are useful because they allow efficient and secure verification of the contents of larger data structures. Hash trees are a generalisation of hash lists and hash chains. To demonstrate that a leaf node is part of a given hash tree requires an amount of data proportional to the log of the number of nodes of the tree. (This contrasts with hash lists, where the amount is proportional to the number of nodes.) The concept is named after Ralph Merkle.”

    The bitcoin source code helps make some of the concepts clearer.  Within the code, the CBlockHeader shows a member variable ‘uint256 hashMerkleRoot’ that refers to the concept mentioned in the paper:

    class CBlockHeader
    {
    public:
        // header
        static const int CURRENT_VERSION=2;
        int nVersion;
        uint256 hashPrevBlock;
        uint256 hashMerkleRoot;
        unsigned int nTime;
        unsigned int nBits;
        unsigned int nNonce;
        ...

    And, as far as the actual construction of the Merkle Tree:

    uint256 BuildMerkleTree() const
    {
        vMerkleTree.clear();
        BOOST_FOREACH(const CTransaction& tx, vtx)
            vMerkleTree.push_back(tx.GetHash());
        int j = 0;
        for (int nSize = vtx.size(); nSize > 1; 
             nSize = (nSize + 1) / 2)
        {
            for (int i = 0; i < nSize; i += 2)
            {
                int i2 = std::min(i+1, nSize-1);
                vMerkleTree.push_back(
                    Hash( BEGIN(vMerkleTree[j+i]),
                          END(vMerkleTree[j+i]),
                          BEGIN(vMerkleTree[j+i2]), 
                          END(vMerkleTree[j+i2])));
            }
            j += nSize;
        }
        return (vMerkleTree.empty() ? 0 : vMerkleTree.back());
    }
    

    The first thing to note is that the Hash() function mentioned in this code is actually a SHA256 double hash. (You can look into the code for yourself to validate this.)

    The code iterates over a vector of transactions and pushes their hashes into a vector representing the nodes of the Merkle tree. It then loops over that vector, hashing adjacent hashes together; when a node has no right-hand neighbor, it is concatenated with itself. Suppose there are 3 transactions, [1, 2, 3]. On the first pass of the outer loop (nSize = 3), the inner loop computes a double hash over the concatenation of the hashes of elements 1 and 2 (call it dhash[1,2]), then a double hash of element 3’s hash concatenated with itself (dhash[3,3]). The variable j then advances past the level just consumed, so on the next iteration the double hash of dhash[1,2] and dhash[3,3] (both from the previous pass) is computed, and the process continues until the tree is built within the vector. The last node (vMerkleTree.back()) ends up being the root hash.

    When I queried some people on various forums as to the necessity of the double hash as opposed to a single hash, they claimed it was desirable as a means of increasing the cost of calculation within the bitcoin framework.
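
    To make the pairing logic concrete outside of the Bitcoin codebase, here is a small Swift sketch of the same idea.  It builds the tree level by level rather than into one flat vector, and the hash function is passed in as a parameter, so a double SHA-256 (or any stand-in) can be supplied:

    import Foundation

    // Same pairing rule as BuildMerkleTree(), expressed level by level.
    // `doubleHash` stands in for Bitcoin's double SHA-256; any
    // (Data) -> Data function can be supplied to trace the mechanics.
    func merkleRoot(of transactionHashes: [Data],
                    doubleHash: (Data) -> Data) -> Data? {
        guard !transactionHashes.isEmpty else { return nil }
        var level = transactionHashes
        while level.count > 1 {
            var next: [Data] = []
            for i in stride(from: 0, to: level.count, by: 2) {
                let left = level[i]
                // No right-hand sibling?  Pair the node with itself,
                // mirroring min(i+1, nSize-1) in the C++ code.
                let right = (i + 1 < level.count) ? level[i + 1] : left
                next.append(doubleHash(left + right))
            }
            level = next
        }
        return level.first
    }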

    Interesting, right? I hope to look more into the transaction mechanisms as I dig further into the system. Nevertheless, the exercise of going through the code to construct the Merkle Tree was educational and enlightening.