<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[The Lapidary Lemur]]></title>
  <link href="http://www.baweaver.com/atom.xml" rel="self"/>
  <link href="http://www.baweaver.com/"/>
  <updated>2015-10-10T20:16:36-07:00</updated>
  <id>http://www.baweaver.com/</id>
  <author>
    <name><![CDATA[Brandon Weaver]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[You Type Too Much]]></title>
    <link href="http://www.baweaver.com/blog/2015/10/10/you-type-too-much/"/>
    <updated>2015-10-10T17:44:58-07:00</updated>
    <id>http://www.baweaver.com/blog/2015/10/10/you-type-too-much</id>
    <content type="html"><![CDATA[<p>You type too much. Whether it&rsquo;s on the command line, in your editor, or in repeating the same code patterns, it all comes down to one thing: you type too much, and I&rsquo;m here to help fix that.</p>

<p>The irony here is that this is a long article in which I most certainly type too much. It&rsquo;s more of an overview than anything else, and follow-ups detailing each of the sections covered will come at a later date.</p>

<!-- more -->


<p>Note that I&rsquo;ll try to include books I&rsquo;ve read and found handy for some of the topics below. Know of another? Leave a comment!</p>

<h1>Learn your Shell</h1>

<p><a href="http://linuxcommand.org/tlcl.php">http://linuxcommand.org/tlcl.php</a></p>

<p>Chances are high you&rsquo;ve been repeating a lot of commands in your shell, especially around git and history-based items. After a while all of those characters add up, so it&rsquo;s time to cut them down to size.</p>

<p>Note that I mentioned some of this in an earlier article: <a href="http://baweaver.dev/blog/2013/09/29/getting-cozy-with-the-command-line/">http://baweaver.dev/blog/2013/09/29/getting-cozy-with-the-command-line/</a></p>

<p>If you notice yourself typing something more than once, or typing a string of commands you tend to forget, it&rsquo;s time to break out some shell scripting and knock things down to size.</p>

<h2>ZSH</h2>

<p>If you haven&rsquo;t checked it out yet, ZSH is loaded with aliases and extra power from the start. I won&rsquo;t cover the full list of features, but the ones we&rsquo;re concerned with here (some of which also exist in Bash) are:</p>

<ul>
<li>Aliasing</li>
<li>Tab Completion</li>
<li>History</li>
<li>Globbing</li>
</ul>


<h3>Aliasing</h3>

<p>(note: Oh-My-ZSH has a lot of this built in: <a href="https://github.com/robbyrussell/oh-my-zsh">https://github.com/robbyrussell/oh-my-zsh</a>)</p>

<p>How often do you find yourself typing in <code>git add</code> or <code>git commit -m</code> or other items? You can alias those into a few characters, saving a lot of typing:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='sh'><span class='line'><span class="nb">alias </span><span class="nv">g</span><span class="o">=</span><span class="s1">&#39;git&#39;</span>
</span><span class='line'><span class="nb">alias </span><span class="nv">gcm</span><span class="o">=</span><span class="s1">&#39;git commit -m&#39;</span>
</span><span class='line'><span class="nb">alias </span><span class="nv">gcb</span><span class="o">=</span><span class="s1">&#39;git checkout -b&#39;</span>
</span><span class='line'><span class="nb">alias </span><span class="nv">gpo</span><span class="o">=</span><span class="s1">&#39;git push origin&#39;</span>
</span></code></pre></td></tr></table></div></figure>


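<p>Now what&rsquo;s the difference here between that and Bash? ZSH supports global aliases, which are expanded anywhere in a command rather than only at the start. Let&rsquo;s say you want to append a command&rsquo;s output to a log, so that <code>make test LOG</code> expands to <code>make test | tee -a ~/log.txt</code>:</p>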

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='sh'><span class='line'><span class="nb">alias</span> -g <span class="nv">LOG</span><span class="o">=</span><span class="s2">&quot;| tee -a ~/log.txt&quot;</span>
</span></code></pre></td></tr></table></div></figure>


<h3>Functions</h3>

<p>Functions, much like in any other language, can be used to combine actions. Sometimes a quick function in your shell is all you need.</p>

<p>How about getting your current branch name for a commit message?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class='sh'><span class='line'><span class="k">function</span> branch_name<span class="o">()</span> <span class="o">{</span> git rev-parse --abbrev-ref HEAD <span class="o">}</span>
</span><span class='line'>
</span><span class='line'>git push origin <span class="sb">`</span>branch_name<span class="sb">`</span>
</span><span class='line'>
</span><span class='line'><span class="c"># Though you could also just:</span>
</span><span class='line'>git push origin HEAD
</span><span class='line'>
</span><span class='line'><span class="c"># Though what if you want your commit messages prefixed with your task?</span>
</span><span class='line'><span class="c">#</span>
</span><span class='line'><span class="c"># ex: ABC-123-my-branch-name</span>
</span><span class='line'><span class="k">function</span> branch_prefix<span class="o">()</span> <span class="o">{</span> branch_name <span class="p">|</span> cut -d<span class="s1">&#39;-&#39;</span> -f1,2 <span class="o">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>I tend to wrap grep, less, and other common shell commands from my workflow in functions like these. It&rsquo;s really handy when it&rsquo;s some heinous AWK or SED line I don&rsquo;t want to remember.</p>
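
<p>Here&rsquo;s a minimal sketch of the task-prefix idea from the comments in the code above, assuming your branches are named like <code>ABC-123-my-branch-name</code>. Note that <code>task_prefix</code> and <code>prefixed_message</code> are hypothetical helper names I made up for illustration, not something git or ZSH ships with:</p>

```shell
# Hypothetical helper: keep only the first two dash-separated fields of a
# branch name, e.g. the task ID "ABC-123" in "ABC-123-my-branch-name".
task_prefix() {
  printf '%s\n' "$1" | cut -d'-' -f1,2
}

# Hypothetical helper: build a commit message prefixed with the task ID.
# $1 is the branch name, $2 is the message itself.
prefixed_message() {
  printf '%s: %s\n' "$(task_prefix "$1")" "$2"
}

# Prints "ABC-123: Fix login redirect"
prefixed_message "ABC-123-my-branch-name" "Fix login redirect"
```

<p>Inside a repository you could pass the output of <code>git rev-parse --abbrev-ref HEAD</code> as the first argument and hand the result to <code>git commit -m</code>. A branch name without a dash, like <code>master</code>, simply comes back whole.</p>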

<h3>Tab Completion</h3>

<p>While this works in Bash with some extensions, it comes built into ZSH. Even more comes up when you have Oh-My-ZSH which uses Compleat: <a href="https://github.com/mbrubeck/compleat">https://github.com/mbrubeck/compleat</a></p>

<p>If you&rsquo;re like me and you prefix your git branches with tags, you can autocomplete against that if you happen to misplace a branch.</p>

<h1>Learn your Editor</h1>

<p>Your editor has features designed to save time as well. While autocompletion comes to mind, I personally find it tedious and not nearly as powerful as other features such as snippets and macros.</p>

<h2>Sublime</h2>

<p><a href="https://sublimetextbook.com/">https://sublimetextbook.com/</a></p>

<p>The first things to look at in Sublime should be those mentioned on the front page, such as column selection and multi-select for words. It&rsquo;s worth reading the documentation there, as several of the features will be immediately usable.</p>

<h3>Plugins</h3>

<p>There&rsquo;s a Sublime plugin for pretty much everything, including that documentation you hate to write by hand and can never remember the syntax for.</p>

<h3>Snippets</h3>

<p>Sublime comes with snippets built in, letting you define blocks of code with interpolatable fields:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='sh'><span class='line'><span class="k">function</span> <span class="k">${</span><span class="nv">1</span><span class="p">:</span><span class="nv">myFunction</span><span class="k">}</span> <span class="o">(</span><span class="k">${</span><span class="nv">2</span><span class="p">:</span><span class="nv">args</span><span class="k">}</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'>  <span class="k">${</span><span class="nv">3</span><span class="p">:</span><span class="nv">return</span><span class="p">;</span><span class="k">}</span>
</span><span class='line'><span class="o">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>These can be bound to language-specific contexts, preventing clashes between snippets that share a name (Jasmine vs RSpec snippets, anyone?)</p>

<h3>Macros</h3>

<p><a href="http://docs.sublimetext.info/en/latest/extensibility/macros.html">http://docs.sublimetext.info/en/latest/extensibility/macros.html</a></p>

<p>These are going to look very familiar if you&rsquo;ve been using vim, at least in terms of key commands.</p>

<p>A macro is a series of actions that can be replayed at a later time, even bound to a key combination.</p>

<p>Catch yourself correcting 4 space indentation to 2 space? You can macro that!</p>

<p>An evil left-bracer got a hold of your files? You can macro that!</p>

<p>Someone is writing Java on your team and you want to get rid of it for Scala? You can macro that one too, but a bit more hackery and a priest to contain the evil during the exorcism will be required.</p>

<h2>Vim</h2>

<p><a href="https://pragprog.com/book/dnvim/practical-vim">https://pragprog.com/book/dnvim/practical-vim</a></p>

<p>Most of the same general features found in Sublime are available in Vim with some extensions, including substantially more powerful macro and snippet features. Sublime just happens to come pre-baked with simpler, saner defaults.</p>

<h3>Shell out</h3>

<p>Vim can run whatever you want from the shell through its command mode (<code>:!</code> followed by the command), and learning to use this will be of extreme benefit. You can even go as far as keeping your own scripts directory for generating more code on the fly.</p>

<h3>UltiSnips</h3>

<p><a href="https://github.com/SirVer/ultisnips">https://github.com/SirVer/ultisnips</a></p>

<p>You remember how Sublime snippets can interpolate values for you? UltiSnips takes it a step further by taking those interpolated values and using them to generate more.</p>

<p>Why is this useful? Think initializers and documentation skeletons. Learn a bit of Python and you can be off with substantially more dynamic snippets.</p>

<h2>What about x IDE?</h2>

<p>IDEs are designed to cater to a wide base, and more often than not I find that makes them very annoying to work with. Editors like Vim and Emacs, on the other hand, allow me to build things from the ground up, catered specifically to my style of programming.</p>

<p>It should come as little surprise that you tend to remember shortcuts that you yourself make as opposed to memorizing a list of commands and keyboard shortcuts.</p>

<p>Then why do I use Sublime on occasion, you may ask? Sublime is far less likely to provoke someone into physical violence against my person when pair programming than a modded-out instance of Vim with remapped keys everywhere.</p>

<p>That, and Sublime is quite frankly a much better editor for people starting out.</p>

<h3>But emacs!</h3>

<p>My religious preferences (vim) prevent me from giving credence to such an eVil editor :P. On a serious note, never saw much of a reason to bother with it as I already knew Vim from SysAdmin work.</p>

<h1>Learn to recognize unabstractable duplication</h1>

<p>There are a few frameworks out there that have a concept of generators. Two of them are Yeoman for NodeJS and Generators for Rails.</p>

<p>Creating generators that are catered to the style of your team can greatly reduce the time and mistakes made when implementing a new section of code. Even if that code can only accurately generate up to 70% of your shippable code, that&rsquo;s 70% you know works and has passed style inspections and the like.</p>

<h2>Yeoman Generators</h2>

<p><a href="http://yeoman.io/authoring/">http://yeoman.io/authoring/</a></p>

<p>Yeoman, it seems, has a bit of a sense of humor in that they&rsquo;ve gone and made a generator-generator to help you make more generators. A bit meta, but quite useful in getting started.</p>

<p>Yeoman generators come with options for making prompts much like a wizard, and the nice thing is that they remember your last responses as the new default options.</p>

<p>Say you like a certain generator but need it to do more: just compose it with another generator to get them to run in tandem.</p>

<p>You could even tie them into your editor using custom made adapters. As long as you respond to IO properly, Yeoman can take care of the rest for you.</p>

<h2>Rails Generator</h2>

<p>A lot of people I know in the Rails world complain about the viability of scaffolded code, saying it comes nowhere close to what they intended. Handily enough, you can customize every step of the scaffold process, including the style of the models, controllers, and views it generates.</p>

<p>Say you just want a simple searchable and sortable bootstrap table with CRUD operations, maybe make that an Angular or React view? You can do that with generators with some customization. You can even generate the RSpec and Jasmine for it while you&rsquo;re at it.</p>

<p>Don&rsquo;t underestimate Rails Generators because they&rsquo;re abused by newer coders. Thumbing your nose at them is a serious mistake.</p>

<h1>Learn to let your code speak for itself</h1>

<p>Of course you could make generators for all of your code, define the perfect styles and agree on everything, but what if something else could generate it already? Seems like a waste to ignore.</p>

<p>Fortunately, the concept of generating service code for RESTful APIs already exists in a number of frameworks:</p>

<ul>
<li>RAML - <a href="http://raml.org/">http://raml.org/</a></li>
<li>API Blueprint - <a href="https://apiblueprint.org/">https://apiblueprint.org/</a></li>
<li>Swagger - <a href="http://swagger.io/">http://swagger.io/</a></li>
</ul>


<p>In most of these you can treat your API definitions as generators for your client service code. Imagine not having to write services in Angular or other frontend frameworks.</p>

<p>Now if you&rsquo;re clever, you could even tie this into an inline-api tool like ApiPie to create things in line with your actual API methods giving you both documentation and workable client code at the same time.</p>

<h1>Finishing up</h1>

<p>The main thing in reducing typing is to never be content running a long string of tasks. Always be looking for areas that can be improved, reduced, or eliminated altogether. Most of this same mentality is already applied to our code, so why not the process leading up to and around our code as well?</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Association Aggregates Explained]]></title>
    <link href="http://www.baweaver.com/blog/2015/09/28/association-aggregates-explained/"/>
    <updated>2015-09-28T19:52:29-07:00</updated>
    <id>http://www.baweaver.com/blog/2015/09/28/association-aggregates-explained</id>
    <content type="html"><![CDATA[<p>The last post covered some of the basics of aggregate commands, but left out a section explaining the more perilous aggregates of associations and more advanced querying against them.</p>

<p>Here are a few questions to get you thinking before we start. Given a model Foo which <code>has_many</code> Tags <code>(key, value, foo_id)</code>:</p>

<ul>
<li>How do we find the count of tags for every Foo?</li>
<li>How do we find a Foo with multiple matching tags? (name: &lsquo;David Tennant&rsquo; AND color: &lsquo;Blue&rsquo;)</li>
</ul>


<p>Suddenly ActiveRecord becomes annoyingly complicated to use, but fear not! We can still use SQL for all of this.</p>

<!-- more -->


<p>So now our application looks something like this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="c1"># Foo: {a: String, b: String, c: String}</span>
</span><span class='line'><span class="c1"># Tag: {key: String, value: Text, foo_id: Integer}</span>
</span><span class='line'><span class="c1"># Foo has_many Tags and Tag belongs_to a Foo</span>
</span><span class='line'>
</span><span class='line'><span class="c1"># Seeds</span>
</span><span class='line'>
</span><span class='line'><span class="n">words</span>   <span class="o">=</span> <span class="no">IO</span><span class="o">.</span><span class="n">readlines</span><span class="p">(</span><span class="s1">&#39;/usr/share/dict/words&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">w</span><span class="o">|</span> <span class="n">w</span><span class="o">.</span><span class="n">chomp</span><span class="o">.</span><span class="n">downcase</span> <span class="p">}</span>
</span><span class='line'><span class="n">records</span> <span class="o">=</span> <span class="mi">100</span><span class="o">.</span><span class="n">times</span><span class="o">.</span><span class="n">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">i</span><span class="o">|</span> <span class="p">{</span><span class="ss">a</span><span class="p">:</span> <span class="n">words</span><span class="o">.</span><span class="n">sample</span><span class="p">,</span> <span class="ss">b</span><span class="p">:</span> <span class="n">words</span><span class="o">.</span><span class="n">sample</span><span class="p">,</span> <span class="ss">c</span><span class="p">:</span> <span class="n">words</span><span class="o">.</span><span class="n">sample</span><span class="p">}</span> <span class="p">}</span>
</span><span class='line'><span class="n">foos</span>    <span class="o">=</span> <span class="no">Foo</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">records</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="n">tag_seeds</span> <span class="o">=</span> <span class="p">{</span>
</span><span class='line'>  <span class="nb">name</span><span class="p">:</span>  <span class="o">[</span>
</span><span class='line'>    <span class="s1">&#39;William Hartnell&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Patrick Troughton&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Jon Pertwee&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Tom Baker&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Peter Davison&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Colin Baker&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Sylvester McCoy&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Paul McGann&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Chris Eccleston&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;David Tennant&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Matt Smith&#39;</span><span class="p">,</span>
</span><span class='line'>    <span class="s1">&#39;Peter Capaldi&#39;</span>
</span><span class='line'>  <span class="o">]</span><span class="p">,</span>
</span><span class='line'>  <span class="ss">place</span><span class="p">:</span> <span class="sx">%w(Tardis Gallifrey Kasterborous Earth)</span><span class="p">,</span>
</span><span class='line'>  <span class="ss">color</span><span class="p">:</span> <span class="sx">%w(Red Blue Yellow Green Black White Orange)</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="n">foos</span><span class="o">.</span><span class="n">each</span> <span class="p">{</span> <span class="o">|</span><span class="n">foo</span><span class="o">|</span>
</span><span class='line'>  <span class="n">tag_seeds</span><span class="o">.</span><span class="n">each</span> <span class="p">{</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">values</span><span class="o">|</span>
</span><span class='line'>    <span class="n">foo</span><span class="o">.</span><span class="n">tags</span> <span class="o">&lt;&lt;</span> <span class="no">Tag</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="ss">key</span><span class="p">:</span> <span class="n">key</span><span class="p">,</span> <span class="ss">value</span><span class="p">:</span> <span class="n">values</span><span class="o">.</span><span class="n">sample</span><span class="p">)</span>
</span><span class='line'>  <span class="p">}</span>
</span><span class='line'>  <span class="n">foo</span><span class="o">.</span><span class="n">save</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<h2>A Path Lesser Traveled</h2>

<p>Normally when you start to look into ActiveRecord Queries, you&rsquo;re going to see some code that looks something like this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="no">Foo</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">a</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="ss">b</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="ss">c</span><span class="p">:</span> <span class="mi">3</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>or perhaps you&rsquo;ll see the normal escaped queries:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="no">Foo</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="s1">&#39;a = ? AND b = ? AND c = ?&#39;</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>&hellip;but lurking in the documentation you&rsquo;ll find another way entirely that&rsquo;s not so often advertised by guides, giving us the same power as the string-based conditionals in what I would argue is a much clearer way.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="no">Foo</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="s1">&#39;a = :a AND b = :b AND c = :c&#39;</span><span class="p">,</span> <span class="p">{</span><span class="ss">a</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="ss">b</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="ss">c</span><span class="p">:</span><span class="mi">3</span><span class="p">})</span>
</span></code></pre></td></tr></table></div></figure>


<p>So why not just use the first variant like any sane developer, you might wonder. Put simply, because this allows us the full leverage of string conditionals with a lot more clarity. Try and do this with the hash syntax:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="no">Foo</span><span class="o">.</span><span class="n">where</span><span class="p">(</span>
</span><span class='line'>  <span class="s1">&#39;a LIKE :a OR b LIKE :b AND created_at &gt; :date AND length(:c) &gt; 5&#39;</span><span class="p">,</span>
</span><span class='line'>  <span class="p">{</span><span class="ss">a</span><span class="p">:</span> <span class="s1">&#39;a%&#39;</span><span class="p">,</span> <span class="ss">b</span><span class="p">:</span> <span class="s1">&#39;b%&#39;</span><span class="p">,</span> <span class="ss">c</span><span class="p">:</span> <span class="mi">5</span><span class="p">,</span> <span class="ss">date</span><span class="p">:</span> <span class="mi">20</span><span class="o">.</span><span class="n">days</span><span class="o">.</span><span class="n">ago</span><span class="p">}</span>
</span><span class='line'><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<h2>Counting associations</h2>

<p>Say you want the count of tags on a foo, how would you go about it?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="no">Foo</span><span class="o">.</span><span class="n">joins</span><span class="p">(</span><span class="ss">:tags</span><span class="p">)</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="s1">&#39;foos.id&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">count</span><span class="p">(</span><span class="s1">&#39;tags.id&#39;</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Now the conundrum here is why do we need to use group? Let&rsquo;s take a look at the generated SQL:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="no">Foo</span><span class="o">.</span><span class="n">joins</span><span class="p">(</span><span class="ss">:tags</span><span class="p">)</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="s1">&#39;foos.id&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">count</span><span class="p">(</span><span class="s1">&#39;tags.id&#39;</span><span class="p">)</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">0</span><span class="o">.</span><span class="mi">5</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">COUNT</span><span class="p">(</span><span class="n">tags</span><span class="o">.</span><span class="n">id</span><span class="p">)</span> <span class="no">AS</span> <span class="n">count_tags_id</span><span class="p">,</span> <span class="n">foos</span><span class="o">.</span><span class="n">id</span> <span class="no">AS</span> <span class="n">foos_id</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span> <span class="no">INNER</span> <span class="no">JOIN</span> <span class="s2">&quot;tags&quot;</span> <span class="no">ON</span> <span class="s2">&quot;tags&quot;</span><span class="o">.</span><span class="s2">&quot;foo_id&quot;</span> <span class="o">=</span> <span class="s2">&quot;foos&quot;</span><span class="o">.</span><span class="s2">&quot;id&quot;</span> <span class="no">GROUP</span> <span class="no">BY</span> <span class="n">foos</span><span class="o">.</span><span class="n">id</span>
</span></code></pre></td></tr></table></div></figure>


<p>In order to run a count on an association, we need to aggregate the records into groups that we&rsquo;ll run the count against. Handy thing is, this allows us to do a few more&hellip; interesting things:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">35</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">joins</span><span class="p">(</span><span class="ss">:tags</span><span class="p">)</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="s1">&#39;tags.key&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">count</span><span class="p">(</span><span class="s1">&#39;tags.id&#39;</span><span class="p">)</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">0</span><span class="o">.</span><span class="mi">6</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">COUNT</span><span class="p">(</span><span class="n">tags</span><span class="o">.</span><span class="n">id</span><span class="p">)</span> <span class="no">AS</span> <span class="n">count_tags_id</span><span class="p">,</span> <span class="n">tags</span><span class="o">.</span><span class="n">key</span> <span class="no">AS</span> <span class="n">tags_key</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span> <span class="no">INNER</span> <span class="no">JOIN</span> <span class="s2">&quot;tags&quot;</span> <span class="no">ON</span> <span class="s2">&quot;tags&quot;</span><span class="o">.</span><span class="s2">&quot;foo_id&quot;</span> <span class="o">=</span> <span class="s2">&quot;foos&quot;</span><span class="o">.</span><span class="s2">&quot;id&quot;</span> <span class="no">GROUP</span> <span class="no">BY</span> <span class="n">tags</span><span class="o">.</span><span class="n">key</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="p">{</span><span class="s2">&quot;color&quot;</span><span class="o">=&gt;</span><span class="mi">100</span><span class="p">,</span> <span class="s2">&quot;name&quot;</span><span class="o">=&gt;</span><span class="mi">100</span><span class="p">,</span> <span class="s2">&quot;place&quot;</span><span class="o">=&gt;</span><span class="mi">100</span><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>You can run aggregates over different groupings, not just the supposedly common case, which is why AR wants you to specify the grouping. In the first case, we&rsquo;re simply telling it to aggregate the tags based on the id of their parent.</p>
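<p>To make the grouping concrete, here is a minimal plain-Ruby sketch of roughly what the database works out for a grouped count. The rows are made up for illustration, and the column names (<code>id</code>, <code>foo_id</code>, <code>key</code>) are assumed from the SQL log above:</p>

```ruby
# Hypothetical in-memory rows standing in for the tags table.
tags = [
  { id: 1, foo_id: 1, key: 'name' },
  { id: 2, foo_id: 1, key: 'color' },
  { id: 3, foo_id: 2, key: 'name' },
  { id: 4, foo_id: 2, key: 'place' },
  { id: 5, foo_id: 3, key: 'name' }
]

# Rough equivalent of GROUP BY tags.key with COUNT(tags.id):
# bucket the rows by key, then count the rows with a non-nil id in each bucket.
counts_by_key = tags
  .group_by { |tag| tag[:key] }
  .map { |key, rows| [key, rows.count { |row| !row[:id].nil? }] }
  .to_h

# Same idea, grouped by the parent instead: GROUP BY tags.foo_id.
counts_by_parent = tags
  .group_by { |tag| tag[:foo_id] }
  .map { |foo_id, rows| [foo_id, rows.size] }
  .to_h
```

<p>The difference in a real app is that the database does this work and only sends the small result hash back, instead of every row.</p>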

<h2>Finding multiple matching tags</h2>

<p>I&rsquo;m going to put a disclaimer here and say that reaching for Single Table Inheritance hacks like this will cause you a lot more harm than good. That said, if your data model is not so friendly and forces you into this, it&rsquo;s something worth remembering.</p>

<h3>Aliased Inner Joins</h3>

<p>SQL supports this through table aliases, which let you join the same table more than once, but AR currently does not give us this power. Thankfully we have access to <code>find_by_sql</code>:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
<span class='line-number'>36</span>
<span class='line-number'>37</span>
<span class='line-number'>38</span>
<span class='line-number'>39</span>
<span class='line-number'>40</span>
<span class='line-number'>41</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">search_tags</span> <span class="o">=</span> <span class="o">[</span>
</span><span class='line'>  <span class="p">{</span><span class="ss">key</span><span class="p">:</span> <span class="s1">&#39;name&#39;</span><span class="p">,</span> <span class="ss">value</span><span class="p">:</span> <span class="s1">&#39;David Tennant&#39;</span><span class="p">},</span>
</span><span class='line'>  <span class="p">{</span><span class="ss">key</span><span class="p">:</span> <span class="s1">&#39;color&#39;</span><span class="p">,</span> <span class="ss">value</span><span class="p">:</span> <span class="s1">&#39;Blue&#39;</span><span class="p">}</span>
</span><span class='line'><span class="o">]</span>
</span><span class='line'>
</span><span class='line'><span class="n">aliased_tags</span> <span class="o">=</span> <span class="n">search_tags</span><span class="o">.</span><span class="n">map</span><span class="o">.</span><span class="n">with_index</span> <span class="p">{</span> <span class="o">|</span><span class="n">tag</span><span class="p">,</span> <span class="n">i</span><span class="o">|</span> <span class="o">[</span><span class="s2">&quot;t</span><span class="si">#{</span><span class="n">i</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">,</span> <span class="n">tag</span><span class="o">]</span> <span class="p">}</span><span class="o">.</span><span class="n">to_h</span>
</span><span class='line'>
</span><span class='line'><span class="n">sql_data</span> <span class="o">=</span> <span class="n">aliased_tags</span><span class="o">.</span><span class="n">reduce</span><span class="p">({</span>
</span><span class='line'>  <span class="ss">sql</span><span class="p">:</span> <span class="s1">&#39;&#39;</span><span class="p">,</span> <span class="ss">data</span><span class="p">:</span> <span class="o">[]</span><span class="p">,</span> <span class="ss">where</span><span class="p">:</span> <span class="p">{}</span>
</span><span class='line'><span class="p">})</span> <span class="p">{</span> <span class="o">|</span><span class="n">state</span><span class="p">,</span> <span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">tag</span><span class="p">)</span><span class="o">|</span>
</span><span class='line'>  <span class="n">state</span><span class="o">[</span><span class="ss">:sql</span><span class="o">]</span>  <span class="o">&lt;&lt;</span> <span class="s2">&quot; INNER JOIN tags AS </span><span class="si">#{</span><span class="n">i</span><span class="si">}</span><span class="s2"> ON </span><span class="si">#{</span><span class="n">i</span><span class="si">}</span><span class="s2">.key = ? &quot;</span>
</span><span class='line'>  <span class="n">state</span><span class="o">[</span><span class="ss">:data</span><span class="o">]</span> <span class="o">&lt;&lt;</span> <span class="n">tag</span><span class="o">[</span><span class="ss">:key</span><span class="o">]</span>
</span><span class='line'>
</span><span class='line'>  <span class="n">state</span><span class="o">[</span><span class="ss">:where</span><span class="o">].</span><span class="n">merge!</span><span class="p">(</span><span class="n">i</span> <span class="o">=&gt;</span> <span class="p">{</span><span class="s2">&quot;</span><span class="si">#{</span><span class="n">i</span><span class="si">}</span><span class="s2">_value&quot;</span> <span class="o">=&gt;</span> <span class="n">tag</span><span class="o">[</span><span class="ss">:value</span><span class="o">]</span><span class="p">})</span>
</span><span class='line'>
</span><span class='line'>  <span class="n">state</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="n">where_clause</span> <span class="o">=</span> <span class="n">sql_data</span><span class="o">[</span><span class="ss">:where</span><span class="o">].</span><span class="n">reduce</span><span class="p">({</span>
</span><span class='line'>  <span class="ss">sql_fragments</span><span class="p">:</span> <span class="o">[]</span><span class="p">,</span> <span class="ss">data</span><span class="p">:</span> <span class="p">{}</span>
</span><span class='line'><span class="p">})</span> <span class="p">{</span> <span class="o">|</span><span class="n">state</span><span class="p">,</span> <span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">tag</span><span class="p">)</span><span class="o">|</span>
</span><span class='line'>  <span class="n">state</span><span class="o">[</span><span class="ss">:sql_fragments</span><span class="o">]</span> <span class="o">&lt;&lt;</span> <span class="s2">&quot;</span><span class="si">#{</span><span class="n">i</span><span class="si">}</span><span class="s2">.value = :</span><span class="si">#{</span><span class="n">tag</span><span class="o">.</span><span class="n">keys</span><span class="o">.</span><span class="n">first</span><span class="si">}</span><span class="s2">&quot;</span>
</span><span class='line'>  <span class="n">state</span><span class="o">[</span><span class="ss">:data</span><span class="o">].</span><span class="n">merge!</span><span class="p">(</span><span class="n">tag</span><span class="o">.</span><span class="n">symbolize_keys</span><span class="p">)</span>
</span><span class='line'>  <span class="n">state</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="n">where_sql</span> <span class="o">=</span> <span class="no">Foo</span><span class="o">.</span><span class="n">where</span><span class="p">(</span>
</span><span class='line'>  <span class="n">where_clause</span><span class="o">[</span><span class="ss">:sql_fragments</span><span class="o">].</span><span class="n">join</span><span class="p">(</span><span class="s1">&#39; AND &#39;</span><span class="p">),</span>
</span><span class='line'>  <span class="n">where_clause</span><span class="o">[</span><span class="ss">:data</span><span class="o">]</span>
</span><span class='line'><span class="p">)</span><span class="o">.</span><span class="n">to_sql</span>
</span><span class='line'>
</span><span class='line'><span class="n">select_sql</span><span class="p">,</span> <span class="n">new_where_sql</span> <span class="o">=</span> <span class="n">where_sql</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s1">&#39;WHERE&#39;</span><span class="p">)</span>
</span><span class='line'><span class="n">final_sql</span> <span class="o">=</span> <span class="n">select_sql</span> <span class="o">+</span> <span class="s1">&#39; &#39;</span> <span class="o">+</span> <span class="n">sql_data</span><span class="o">[</span><span class="ss">:sql</span><span class="o">]</span> <span class="o">+</span> <span class="s2">&quot; WHERE &quot;</span> <span class="o">+</span> <span class="n">new_where_sql</span>
</span><span class='line'>
</span><span class='line'><span class="no">Foo</span><span class="o">.</span><span class="n">find_by_sql</span><span class="p">(</span><span class="o">[</span><span class="n">final_sql</span><span class="p">,</span> <span class="o">*</span><span class="n">sql_data</span><span class="o">[</span><span class="ss">:data</span><span class="o">]]</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="c1"># The final SQL looks something like this:</span>
</span><span class='line'><span class="no">SELECT</span> <span class="s2">&quot;foos&quot;</span><span class="o">.</span><span class="n">*</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span><span class='line'>  <span class="no">INNER</span> <span class="no">JOIN</span> <span class="n">tags</span> <span class="no">AS</span> <span class="n">t0</span> <span class="no">ON</span> <span class="n">t0</span><span class="o">.</span><span class="n">key</span> <span class="o">=</span> <span class="s1">&#39;name&#39;</span>
</span><span class='line'>  <span class="no">INNER</span> <span class="no">JOIN</span> <span class="n">tags</span> <span class="no">AS</span> <span class="n">t1</span> <span class="no">ON</span> <span class="n">t1</span><span class="o">.</span><span class="n">key</span> <span class="o">=</span> <span class="s1">&#39;color&#39;</span>
</span><span class='line'>  <span class="no">WHERE</span> <span class="p">(</span><span class="n">t0</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="s1">&#39;David Tennant&#39;</span> <span class="no">AND</span> <span class="n">t1</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="s1">&#39;Blue&#39;</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Note that this is most certainly not the best way to go about it, and suggestions for better approaches are quite welcome.</p>

<p>What we&rsquo;re doing here is, in essence, creating an on-the-fly single table inheritance to query against.</p>

<p>Admittedly a better way to do this currently eludes me, and I would recommend against using this in your own solutions.</p>

<p>One can use subqueries to get around this type of issue, but the solution will look similar and may well be slower.</p>
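<p>For what it&rsquo;s worth, the aliased-join idea can also be sketched without touching Active Record at all, by building the SQL string and its bind values directly. This is only a sketch under assumptions: the helper name <code>aliased_tag_query</code> is made up, and it assumes <code>tags</code> has <code>foo_id</code>, <code>key</code>, and <code>value</code> columns. It also ties each alias back to its parent row through <code>foo_id</code>, so that a tag only matches the <code>foo</code> it belongs to:</p>

```ruby
# Hypothetical helper: builds one aliased join per key/value pair.
# Returns [sql, *binds], the array form that find_by_sql accepts.
# Assumes tags has foo_id, key and value columns.
def aliased_tag_query(search_tags)
  raise ArgumentError, 'need at least one tag' if search_tags.empty?

  joins = []
  conditions = []
  key_binds = []
  value_binds = []

  search_tags.each_with_index do |tag, index|
    tag_alias = "t#{index}" # generated here, never taken from user input

    joins.push(
      "INNER JOIN tags AS #{tag_alias} " \
      "ON #{tag_alias}.foo_id = foos.id AND #{tag_alias}.key = ?"
    )
    conditions.push("#{tag_alias}.value = ?")

    key_binds.push(tag[:key])
    value_binds.push(tag[:value])
  end

  sql = "SELECT foos.* FROM foos #{joins.join(' ')} WHERE #{conditions.join(' AND ')}"

  # The join placeholders appear first in the SQL, so their binds go first.
  [sql, *key_binds, *value_binds]
end

query = aliased_tag_query([
  { key: 'name',  value: 'David Tennant' },
  { key: 'color', value: 'Blue' }
])
```

<p>With Active Record loaded, the resulting array could then be handed to <code>Foo.find_by_sql(query)</code>. Keeping every value as a bind placeholder means only the generated aliases are ever interpolated into the SQL.</p>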

<h2>Finishing up</h2>

<p>As you can see by the end of the article, this design of ours quickly devolves into madness once you start querying against it. In the next sections, I&rsquo;ll be covering methods of database design to avoid these issues as much as possible.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Aggregate Active Record]]></title>
    <link href="http://www.baweaver.com/blog/2015/09/07/aggregate-active-record/"/>
    <updated>2015-09-07T18:49:41-07:00</updated>
    <id>http://www.baweaver.com/blog/2015/09/07/aggregate-active-record</id>
    <content type="html"><![CDATA[<p>Active Record is an extremely powerful abstraction over SQL, but many a Rails programmer tends to forget that this puts much of SQL within reach, not just simple finders. While it might be common knowledge for some, aggregate queries seem to be missing from the toolkit of many newer Rails programmers.</p>

<!-- more -->


<p>For this we&rsquo;ll be using a model called <code>Foo</code> with the fields <code>a</code>, <code>b</code>, and <code>c</code>. All of the fields are strings with random words chosen from OS X&rsquo;s built-in word list:</p>

<figure class='code'><figcaption><span>db_seed</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="c1"># rails g model foo a b c</span>
</span><span class='line'><span class="n">words</span>   <span class="o">=</span> <span class="no">IO</span><span class="o">.</span><span class="n">readlines</span><span class="p">(</span><span class="s1">&#39;/usr/share/dict/words&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">flat_map</span> <span class="p">{</span> <span class="o">|</span><span class="n">w</span><span class="o">|</span> <span class="n">w</span><span class="o">.</span><span class="n">chomp</span><span class="o">.</span><span class="n">downcase</span> <span class="p">}</span>
</span><span class='line'><span class="n">records</span> <span class="o">=</span> <span class="mi">10_000</span><span class="o">.</span><span class="n">times</span><span class="o">.</span><span class="n">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">i</span><span class="o">|</span> <span class="p">{</span><span class="ss">a</span><span class="p">:</span> <span class="n">words</span><span class="o">.</span><span class="n">sample</span><span class="p">,</span> <span class="ss">b</span><span class="p">:</span> <span class="n">words</span><span class="o">.</span><span class="n">sample</span><span class="p">,</span> <span class="ss">c</span><span class="p">:</span> <span class="n">words</span><span class="o">.</span><span class="n">sample</span><span class="p">}</span> <span class="p">}</span>
</span><span class='line'><span class="no">Foo</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">records</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<h1>Count</h1>

<p>How many times have you done this?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="no">Model</span><span class="o">.</span><span class="n">all</span><span class="o">.</span><span class="n">size</span>
</span></code></pre></td></tr></table></div></figure>


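<p>The problem with this one is quite simply that, whenever it ends up loading the records (as <code>all</code> always did before Rails 4, when it returned a plain array), it&rsquo;s retrieving all of them just to get a count. Seem inefficient? It is:</p>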

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">3</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">all</span>
</span><span class='line'>  <span class="no">Foo</span> <span class="no">Load</span> <span class="p">(</span><span class="mi">33</span><span class="o">.</span><span class="mi">7</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="s2">&quot;foos&quot;</span><span class="o">.</span><span class="n">*</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span></code></pre></td></tr></table></div></figure>


<p>Instead, use the count method:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">4</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">count</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">0</span><span class="o">.</span><span class="mi">2</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span></code></pre></td></tr></table></div></figure>


<p>Let&rsquo;s go ahead and blank out the <code>a</code> field for the first thousand or so records. Note that I don&rsquo;t use <code>first</code>, as that returns an array, and <code>update_all</code> is only available on a relation:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">8</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="s1">&#39;id &lt; 1000&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">update_all</span><span class="p">(</span><span class="ss">a</span><span class="p">:</span> <span class="kp">nil</span><span class="p">)</span>
</span><span class='line'>  <span class="no">SQL</span> <span class="p">(</span><span class="mi">3</span><span class="o">.</span><span class="mi">5</span><span class="n">ms</span><span class="p">)</span>  <span class="no">UPDATE</span> <span class="s2">&quot;foos&quot;</span> <span class="no">SET</span> <span class="s2">&quot;a&quot;</span> <span class="o">=</span> <span class="no">NULL</span> <span class="no">WHERE</span> <span class="p">(</span><span class="nb">id</span> <span class="o">&lt;</span> <span class="mi">1000</span><span class="p">)</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="mi">999</span>
</span></code></pre></td></tr></table></div></figure>


<p>Now how would we get the count of records where <code>a</code> is present? <code>count</code> takes a column name, and <code>COUNT(column)</code> skips NULL values:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">9</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">count</span><span class="p">(</span><span class="ss">:a</span><span class="p">)</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">1</span><span class="o">.</span><span class="mi">5</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">COUNT</span><span class="p">(</span><span class="s2">&quot;foos&quot;</span><span class="o">.</span><span class="s2">&quot;a&quot;</span><span class="p">)</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="mi">9001</span>
</span></code></pre></td></tr></table></div></figure>


<p>Would you look at that, it&rsquo;s over 9000!</p>
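<p>If the difference between the two counts seems fuzzy, here is a tiny plain-Ruby sketch of the same rule, using made-up rows where <code>nil</code> stands in for NULL:</p>

```ruby
# Hypothetical rows: three records, one of them with a NULL (nil) a.
rows = [{ a: 'lemur' }, { a: nil }, { a: 'opal' }]

count_star = rows.size                          # like COUNT(*): every row counts
count_a    = rows.count { |row| !row[:a].nil? } # like COUNT("foos"."a"): NULLs are skipped
```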

<h1>Group</h1>

<p>Let&rsquo;s say we want to group our records by the length of <code>a</code> to find out how many words there are of each length. Ruby has a built-in <code>group_by</code> method:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">28</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">all</span><span class="o">.</span><span class="n">group_by</span> <span class="p">{</span> <span class="o">|</span><span class="n">v</span><span class="o">|</span> <span class="n">v</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">try</span><span class="p">(</span><span class="ss">:size</span><span class="p">)</span> <span class="p">}</span><span class="o">.</span><span class="n">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">k</span><span class="p">,</span><span class="n">v</span><span class="o">|</span> <span class="o">[</span><span class="n">k</span><span class="p">,</span><span class="n">v</span><span class="o">.</span><span class="n">size</span><span class="o">]</span> <span class="p">}</span><span class="o">.</span><span class="n">to_h</span>
</span><span class='line'>  <span class="no">Foo</span> <span class="no">Load</span> <span class="p">(</span><span class="mi">22</span><span class="o">.</span><span class="mi">5</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="s2">&quot;foos&quot;</span><span class="o">.</span><span class="n">*</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="p">{</span><span class="kp">nil</span><span class="o">=&gt;</span><span class="mi">999</span><span class="p">,</span>
</span><span class='line'> <span class="mi">14</span><span class="o">=&gt;</span><span class="mi">348</span><span class="p">,</span>
</span><span class='line'> <span class="mi">10</span><span class="o">=&gt;</span><span class="mi">1246</span><span class="p">,</span>
</span><span class='line'> <span class="mi">11</span><span class="o">=&gt;</span><span class="mi">992</span><span class="p">,</span>
</span><span class='line'> <span class="mi">4</span><span class="o">=&gt;</span><span class="mi">204</span><span class="p">,</span>
</span><span class='line'> <span class="mi">8</span><span class="o">=&gt;</span><span class="mi">1166</span><span class="p">,</span>
</span><span class='line'> <span class="mi">9</span><span class="o">=&gt;</span><span class="mi">1262</span><span class="p">,</span>
</span><span class='line'> <span class="mi">5</span><span class="o">=&gt;</span><span class="mi">404</span><span class="p">,</span>
</span><span class='line'> <span class="mi">6</span><span class="o">=&gt;</span><span class="mi">630</span><span class="p">,</span>
</span><span class='line'> <span class="mi">7</span><span class="o">=&gt;</span><span class="mi">856</span><span class="p">,</span>
</span><span class='line'> <span class="mi">12</span><span class="o">=&gt;</span><span class="mi">795</span><span class="p">,</span>
</span><span class='line'> <span class="mi">16</span><span class="o">=&gt;</span><span class="mi">117</span><span class="p">,</span>
</span><span class='line'> <span class="mi">13</span><span class="o">=&gt;</span><span class="mi">576</span><span class="p">,</span>
</span><span class='line'> <span class="mi">15</span><span class="o">=&gt;</span><span class="mi">220</span><span class="p">,</span>
</span><span class='line'> <span class="mi">17</span><span class="o">=&gt;</span><span class="mi">65</span><span class="p">,</span>
</span><span class='line'> <span class="mi">3</span><span class="o">=&gt;</span><span class="mi">51</span><span class="p">,</span>
</span><span class='line'> <span class="mi">18</span><span class="o">=&gt;</span><span class="mi">40</span><span class="p">,</span>
</span><span class='line'> <span class="mi">19</span><span class="o">=&gt;</span><span class="mi">14</span><span class="p">,</span>
</span><span class='line'> <span class="mi">2</span><span class="o">=&gt;</span><span class="mi">4</span><span class="p">,</span>
</span><span class='line'> <span class="mi">20</span><span class="o">=&gt;</span><span class="mi">6</span><span class="p">,</span>
</span><span class='line'> <span class="mi">1</span><span class="o">=&gt;</span><span class="mi">2</span><span class="p">,</span>
</span><span class='line'> <span class="mi">22</span><span class="o">=&gt;</span><span class="mi">2</span><span class="p">,</span>
</span><span class='line'> <span class="mi">21</span><span class="o">=&gt;</span><span class="mi">1</span><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>That <code>all</code> should be enough of a trigger to start looking for an aggregate method:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">29</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="s1">&#39;length(a)&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">count</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">9</span><span class="o">.</span><span class="mi">7</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="no">AS</span> <span class="n">count_all</span><span class="p">,</span> <span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="no">AS</span> <span class="n">length_a</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span> <span class="no">GROUP</span> <span class="no">BY</span> <span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="p">{</span><span class="kp">nil</span><span class="o">=&gt;</span><span class="mi">999</span><span class="p">,</span>
</span><span class='line'> <span class="mi">1</span><span class="o">=&gt;</span><span class="mi">2</span><span class="p">,</span>
</span><span class='line'> <span class="mi">2</span><span class="o">=&gt;</span><span class="mi">4</span><span class="p">,</span>
</span><span class='line'> <span class="mi">3</span><span class="o">=&gt;</span><span class="mi">51</span><span class="p">,</span>
</span><span class='line'> <span class="mi">4</span><span class="o">=&gt;</span><span class="mi">204</span><span class="p">,</span>
</span><span class='line'> <span class="mi">5</span><span class="o">=&gt;</span><span class="mi">404</span><span class="p">,</span>
</span><span class='line'> <span class="mi">6</span><span class="o">=&gt;</span><span class="mi">630</span><span class="p">,</span>
</span><span class='line'> <span class="mi">7</span><span class="o">=&gt;</span><span class="mi">856</span><span class="p">,</span>
</span><span class='line'> <span class="mi">8</span><span class="o">=&gt;</span><span class="mi">1166</span><span class="p">,</span>
</span><span class='line'> <span class="mi">9</span><span class="o">=&gt;</span><span class="mi">1262</span><span class="p">,</span>
</span><span class='line'> <span class="mi">10</span><span class="o">=&gt;</span><span class="mi">1246</span><span class="p">,</span>
</span><span class='line'> <span class="mi">11</span><span class="o">=&gt;</span><span class="mi">992</span><span class="p">,</span>
</span><span class='line'> <span class="mi">12</span><span class="o">=&gt;</span><span class="mi">795</span><span class="p">,</span>
</span><span class='line'> <span class="mi">13</span><span class="o">=&gt;</span><span class="mi">576</span><span class="p">,</span>
</span><span class='line'> <span class="mi">14</span><span class="o">=&gt;</span><span class="mi">348</span><span class="p">,</span>
</span><span class='line'> <span class="mi">15</span><span class="o">=&gt;</span><span class="mi">220</span><span class="p">,</span>
</span><span class='line'> <span class="mi">16</span><span class="o">=&gt;</span><span class="mi">117</span><span class="p">,</span>
</span><span class='line'> <span class="mi">17</span><span class="o">=&gt;</span><span class="mi">65</span><span class="p">,</span>
</span><span class='line'> <span class="mi">18</span><span class="o">=&gt;</span><span class="mi">40</span><span class="p">,</span>
</span><span class='line'> <span class="mi">19</span><span class="o">=&gt;</span><span class="mi">14</span><span class="p">,</span>
</span><span class='line'> <span class="mi">20</span><span class="o">=&gt;</span><span class="mi">6</span><span class="p">,</span>
</span><span class='line'> <span class="mi">21</span><span class="o">=&gt;</span><span class="mi">1</span><span class="p">,</span>
</span><span class='line'> <span class="mi">22</span><span class="o">=&gt;</span><span class="mi">2</span><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>SQL functions are perfectly valid in this context, and quite helpful as well. Passing a plain column name to <code>group</code> works the same way, grouping records that share a value in that column.</p>

<h1>Pluck</h1>

<p>Pluck doesn&rsquo;t just fetch certain columns from a database; it can also evaluate SQL functions. Let&rsquo;s say we want a list of the distinct word lengths we have:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">35</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">pluck</span><span class="p">(</span><span class="s1">&#39;DISTINCT length(a)&#39;</span><span class="p">)</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">3</span><span class="o">.</span><span class="mi">3</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">DISTINCT</span> <span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="o">[</span><span class="kp">nil</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">13</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">18</span><span class="p">,</span> <span class="mi">19</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">22</span><span class="p">,</span> <span class="mi">21</span><span class="o">]</span>
</span></code></pre></td></tr></table></div></figure>


<p>How about the average length of our <code>a</code> column?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">36</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">pluck</span><span class="p">(</span><span class="s1">&#39;avg(length(a))&#39;</span><span class="p">)</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">2</span><span class="o">.</span><span class="mi">2</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="n">avg</span><span class="p">(</span><span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">))</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="o">[</span><span class="mi">9</span><span class="o">.</span><span class="mi">574158426841462</span><span class="o">]</span>
</span></code></pre></td></tr></table></div></figure>


<p>Note that you can use the <code>average</code> method here as well:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">39</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">average</span><span class="p">(</span><span class="s1">&#39;length(a)&#39;</span><span class="p">)</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">2</span><span class="o">.</span><span class="mi">9</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">AVG</span><span class="p">(</span><span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">))</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="c1">#&lt;BigDecimal:7fcf75efd2f0,&#39;0.9574158426 84146E1&#39;,27(36)&gt;</span>
</span></code></pre></td></tr></table></div></figure>


<p>&hellip;but what you cannot do with <code>average</code>, <code>minimum</code>, <code>maximum</code>, and the other calculation methods is this useful tidbit:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">40</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">pluck</span><span class="p">(</span><span class="s1">&#39;avg(length(a))&#39;</span><span class="p">,</span> <span class="s1">&#39;max(length(a))&#39;</span><span class="p">,</span> <span class="s1">&#39;min(length(a))&#39;</span><span class="p">,</span> <span class="s1">&#39;count(a)&#39;</span><span class="p">)</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">5</span><span class="o">.</span><span class="mi">4</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="n">avg</span><span class="p">(</span><span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">)),</span> <span class="n">max</span><span class="p">(</span><span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">)),</span> <span class="n">min</span><span class="p">(</span><span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">)),</span> <span class="n">count</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="o">[[</span><span class="mi">9</span><span class="o">.</span><span class="mi">574158426841462</span><span class="p">,</span> <span class="mi">22</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">9001</span><span class="o">]]</span>
</span></code></pre></td></tr></table></div></figure>


<p>Getting those four values without aggregate functions means pulling every row into Ruby first, which is likely to take quite a while indeed.</p>
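
<p>For comparison, here is a rough sketch of the Ruby-side version of that query. The <code>words</code> array is a made-up stand-in for what something like <code>Foo.pluck(:a)</code> would hand back, so treat it as an illustration rather than a benchmark:</p>

```ruby
# Stand-in for the values of column a; in the app these would have to be
# loaded from the database first (for example with Foo.pluck(:a)).
words = ['lemur', 'lapidary', nil, 'ruby', 'sql']

# SQL aggregates skip NULLs, so drop nils before measuring anything.
lengths = words.compact.map { |word| word.length }

stats = [
  lengths.sum.to_f / lengths.size, # avg(length(a))
  lengths.max,                     # max(length(a))
  lengths.min,                     # min(length(a))
  lengths.size                     # count(a)
]
```

<p>Every value has to cross the wire and be turned into a Ruby object before any of that arithmetic runs, which is exactly the work the aggregate version leaves to the database.</p>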

<h1>Calculations</h1>

<p>There are some other common functions that may well come in handy if you only happen to need one value:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">42</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">minimum</span><span class="p">(</span><span class="s1">&#39;length(a)&#39;</span><span class="p">)</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">2</span><span class="o">.</span><span class="mi">4</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">MIN</span><span class="p">(</span><span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">))</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="mi">1</span>
</span><span class='line'>
</span><span class='line'><span class="o">[</span><span class="mi">43</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">maximum</span><span class="p">(</span><span class="s1">&#39;length(a)&#39;</span><span class="p">)</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">2</span><span class="o">.</span><span class="mi">2</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">MAX</span><span class="p">(</span><span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">))</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="mi">22</span>
</span></code></pre></td></tr></table></div></figure>


<h1>Where</h1>

<p>Not an aggregate per se, but a where clause can still use SQL functions. Say you only want records with an <code>a</code> field longer than 10 characters:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">44</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="s1">&#39;length(a) &gt; 10&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">count</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">2</span><span class="o">.</span><span class="mi">1</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="no">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span> <span class="no">WHERE</span> <span class="p">(</span><span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">10</span><span class="p">)</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="mi">3176</span>
</span></code></pre></td></tr></table></div></figure>


<p>Maybe the count isn&rsquo;t what you&rsquo;re after. Perhaps you want the ids instead:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">45</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">Foo</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="s1">&#39;length(a) &gt; 10&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">ids</span>
</span><span class='line'>   <span class="p">(</span><span class="mi">5</span><span class="o">.</span><span class="mi">0</span><span class="n">ms</span><span class="p">)</span>  <span class="no">SELECT</span> <span class="s2">&quot;foos&quot;</span><span class="o">.</span><span class="s2">&quot;id&quot;</span> <span class="no">FROM</span> <span class="s2">&quot;foos&quot;</span> <span class="no">WHERE</span> <span class="p">(</span><span class="n">length</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">10</span><span class="p">)</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="o">[</span><span class="mi">1000</span><span class="p">,</span>
</span><span class='line'> <span class="mi">1002</span><span class="p">,</span>
</span><span class='line'> <span class="mi">1004</span><span class="p">,</span>
</span><span class='line'> <span class="c1"># ...</span>
</span></code></pre></td></tr></table></div></figure>


<p>Never underestimate the value of being familiar with basic functions in SQL, as they&rsquo;ll save your database a lot of headaches.</p>

<h1>Finishing up</h1>

<p>While a strong knowledge of SQL is not always necessary for Rails development, it will most certainly improve your code and your performance. Not everything has to fit into hash arguments for a where clause.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Clairvoyant Project]]></title>
    <link href="http://www.baweaver.com/blog/2015/07/04/the-clairvoyant-project/"/>
    <updated>2015-07-04T21:48:54-07:00</updated>
    <id>http://www.baweaver.com/blog/2015/07/04/the-clairvoyant-project</id>
    <content type="html"><![CDATA[<p><a href="https://github.com/baweaver/clairvoyant">The Clairvoyant project</a> is one of my more ambitious personal projects, with one &ldquo;simple&rdquo; goal in mind: <em>Your tests should be able to generate your application code</em></p>

<p>This post will outline the beginnings of the madness that led to Clairvoyant as well as some of the details of how things are planned to be implemented.</p>

<!-- more -->


<h2>Code as Data</h2>

<p>The LISPers among you will notice a very common theme throughout this post. That theme is quite simply that I&rsquo;m taking a Ruby file and treating it as data for an entirely different parser.</p>

<p>The DSL, our data set, is already there. The question becomes: what can we divine from what we have with reasonable certainty?</p>

<h2>Logic Languages</h2>

<p>Along with inspiration from LISP, we&rsquo;re drawing pretty heavily from logic languages such as Prolog. A logic program is a statement of facts and rules used to derive an answer to a question:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class='prolog'><span class='line'><span class="p">{</span><span class="nn">http</span><span class="p">:</span><span class="o">//</span><span class="s-Atom">www</span><span class="p">.</span><span class="s-Atom">csse</span><span class="p">.</span><span class="s-Atom">monash</span><span class="p">.</span><span class="s-Atom">edu</span><span class="p">.</span><span class="s-Atom">au/~lloyd</span><span class="o">/</span><span class="s-Atom">tildeLogic</span><span class="o">/</span><span class="nv">Prolog</span><span class="p">.</span><span class="s-Atom">toy</span><span class="o">/</span><span class="nv">Examples</span><span class="s-Atom">/</span><span class="p">}</span>
</span><span class='line'><span class="nf">witch</span><span class="p">(</span><span class="nv">X</span><span class="p">)</span>  <span class="s-Atom">&lt;=</span> <span class="nf">burns</span><span class="p">(</span><span class="nv">X</span><span class="p">)</span> <span class="s-Atom">and</span> <span class="nf">female</span><span class="p">(</span><span class="nv">X</span><span class="p">).</span>
</span><span class='line'><span class="nf">burns</span><span class="p">(</span><span class="nv">X</span><span class="p">)</span>  <span class="s-Atom">&lt;=</span> <span class="nf">wooden</span><span class="p">(</span><span class="nv">X</span><span class="p">).</span>
</span><span class='line'><span class="nf">wooden</span><span class="p">(</span><span class="nv">X</span><span class="p">)</span> <span class="s-Atom">&lt;=</span> <span class="nf">floats</span><span class="p">(</span><span class="nv">X</span><span class="p">).</span>
</span><span class='line'><span class="nf">floats</span><span class="p">(</span><span class="nv">X</span><span class="p">)</span> <span class="s-Atom">&lt;=</span> <span class="nf">sameweight</span><span class="p">(</span><span class="s-Atom">duck</span><span class="p">,</span> <span class="nv">X</span><span class="p">).</span>
</span><span class='line'>
</span><span class='line'><span class="nf">female</span><span class="p">(</span><span class="s-Atom">girl</span><span class="p">).</span>          <span class="p">{</span><span class="s-Atom">by</span> <span class="s-Atom">observation</span><span class="p">}</span>
</span><span class='line'><span class="nf">sameweight</span><span class="p">(</span><span class="s-Atom">duck</span><span class="p">,</span><span class="s-Atom">girl</span><span class="p">).</span> <span class="p">{</span><span class="s-Atom">by</span> <span class="s-Atom">experiment</span> <span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="s-Atom">?</span> <span class="nf">witch</span><span class="p">(</span><span class="s-Atom">girl</span><span class="p">).</span> <span class="p">{</span><span class="nv">Now</span> <span class="s-Atom">we</span> <span class="s-Atom">ask</span> <span class="s-Atom">it</span><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>
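
<p>To make the derivation concrete, the same facts and rules can be restated as plain Ruby data. This is only a sketch: <code>FACTS</code>, <code>RULES</code>, and <code>holds?</code> are names I made up for it, and it does naive backward chaining rather than anything a real Prolog engine would do:</p>

```ruby
# Facts: things we know by observation or experiment.
FACTS = {
  female:     [[:girl]],
  sameweight: [[:duck, :girl]]
}

# Rules: a predicate holds for x when the predicates it depends on hold.
RULES = {
  witch:  ->(x) { holds?(:burns, x) and holds?(:female, x) },
  burns:  ->(x) { holds?(:wooden, x) },
  wooden: ->(x) { holds?(:floats, x) },
  floats: ->(x) { holds?(:sameweight, :duck, x) }
}

# A predicate holds if it is a known fact, or if its rule can be satisfied.
def holds?(predicate, *args)
  return true if FACTS.fetch(predicate, []).include?(args)

  rule = RULES[predicate]
  rule ? rule.call(*args) : false
end

# Now we ask it.
answer = holds?(:witch, :girl)
```

<p>Asking about <code>:girl</code> walks from <code>witch</code> down to the <code>sameweight</code> fact and back up again; anything we have no facts about simply fails to be derived.</p>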


<p>Now I don&rsquo;t pretend to be an expert in Prolog, or even particularly good at it. Even so, it still reminds me of something:</p>

<figure class='code'><figcaption><span>rspec-example</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">describe</span> <span class="no">Witch</span> <span class="k">do</span>
</span><span class='line'>  <span class="n">describe</span> <span class="s1">&#39;#burns&#39;</span> <span class="k">do</span>
</span><span class='line'>    <span class="n">it</span> <span class="s1">&#39;burns&#39;</span> <span class="k">do</span>
</span><span class='line'>      <span class="n">expect</span><span class="p">(</span><span class="n">subject</span><span class="o">.</span><span class="n">burns</span><span class="p">)</span><span class="o">.</span><span class="n">to</span> <span class="n">eq</span><span class="p">(</span><span class="kp">true</span><span class="p">)</span>
</span><span class='line'>    <span class="k">end</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'>
</span><span class='line'>  <span class="n">describe</span> <span class="s1">&#39;#wooden&#39;</span> <span class="k">do</span>
</span><span class='line'>    <span class="n">it</span> <span class="s1">&#39;is made of wood&#39;</span> <span class="k">do</span>
</span><span class='line'>      <span class="n">expect</span><span class="p">(</span><span class="n">subject</span><span class="o">.</span><span class="n">wooden</span><span class="p">)</span><span class="o">.</span><span class="n">to</span> <span class="n">eq</span><span class="p">(</span><span class="kp">true</span><span class="p">)</span>
</span><span class='line'>    <span class="k">end</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'>
</span><span class='line'>  <span class="c1"># ...</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>So if RSpec looks like it&rsquo;s already making assertions about the nature of our program, what happens if we treat it like a logic language?</p>

<h2>Repurposing a DSL</h2>

<p>The thing about Ruby is it&rsquo;s incredibly flexible. The DSL from RSpec can just as easily be hijacked and evaluated by our own parser with this one simple trick (and I swear I won&rsquo;t clickbait):</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">MyParser</span>
</span><span class='line'>  <span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="n">file_lines</span><span class="p">)</span>
</span><span class='line'>    <span class="vi">@descriptions</span> <span class="o">=</span> <span class="o">[]</span>
</span><span class='line'>    <span class="n">instance_eval</span><span class="p">(</span><span class="n">file_lines</span><span class="p">)</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Now what have we done? We&rsquo;ve evaluated the entirety of the loaded file in the context of our parser instance, so any bare method call in the spec is sent to that instance. What happens if we redefine <code>describe</code> inside of there?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">describe</span><span class="p">(</span><span class="n">description</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">block</span><span class="p">)</span>
</span><span class='line'>  <span class="vi">@descriptions</span> <span class="o">&lt;&lt;</span> <span class="n">description</span>
</span><span class='line'>  <span class="n">block</span><span class="o">.</span><span class="n">call</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="c1"># Class-level hook: now when it hits &#39;Witch&#39;, it returns a symbol of the name instead</span>
</span><span class='line'><span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">const_missing</span><span class="p">(</span><span class="nb">name</span><span class="p">)</span>
</span><span class='line'>  <span class="nb">name</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>We can capture the entirety of the <code>describe</code> blocks, or for that matter anything else we want. As long as the spec file isn&rsquo;t using <code>::RSpec.describe</code> we can hijack whatever we want. If it does, it just makes it mildly more annoying to reason about.</p>
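
<p>Putting those pieces together, a minimal runnable sketch might look like the following. <code>SketchParser</code>, its <code>examples</code> list, and the inline spec string are assumptions made for this illustration, not what Clairvoyant itself ships:</p>

```ruby
# A tiny parser that hijacks describe and it instead of running the specs.
class SketchParser
  attr_reader :descriptions, :examples

  # Unknown constants such as Witch resolve to their own name as a symbol.
  def self.const_missing(name)
    name
  end

  def initialize(source)
    @descriptions = []
    @examples = []
    # Evaluate the spec source with this parser instance as self.
    instance_eval(source)
  end

  # Record the description, then descend into the nested block.
  def describe(description)
    @descriptions.push(description)
    yield
  end

  # Record the example name, but never run its body.
  def it(description)
    @examples.push(description)
  end
end

spec_source = %q{
  describe Witch do
    describe '#burns' do
      it 'burns' do
        expect(subject.burns).to eq(true)
      end
    end
  end
}

parser = SketchParser.new(spec_source)
```

<p>Because the body of each <code>it</code> block is never called, <code>expect</code> and <code>subject</code> don&rsquo;t need to exist yet. Capturing those is where things get more involved.</p>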

<p>This is effectively the current state of Clairvoyant. You can take a <em>very basic</em> spec file and generate a skeleton of it such that:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">describe</span> <span class="no">Foo</span> <span class="k">do</span>
</span><span class='line'>  <span class="n">describe</span> <span class="s1">&#39;#bar&#39;</span> <span class="k">do</span>
</span><span class='line'>    <span class="n">it</span> <span class="s1">&#39;does something magical&#39;</span> <span class="k">do</span>
</span><span class='line'>      <span class="n">expect</span><span class="p">(</span><span class="n">subject</span><span class="o">.</span><span class="n">bar</span><span class="p">)</span><span class="o">.</span><span class="n">to</span> <span class="n">eq</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
</span><span class='line'>    <span class="k">end</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Will generate:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">Foo</span>
</span><span class='line'>  <span class="c1"># It does something magical</span>
</span><span class='line'>  <span class="c1">#</span>
</span><span class='line'>  <span class="c1"># @return [Integer]</span>
</span><span class='line'>  <span class="k">def</span> <span class="nf">bar</span>
</span><span class='line'>    <span class="c1"># Code goes here later</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>
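
<p>As a rough idea of how little is needed for that step, the skeleton can be rendered from four facts pulled out of the spec. <code>render_skeleton</code> is a hypothetical helper written for this post, not Clairvoyant&rsquo;s actual generator, and it assumes Ruby 2.4 or later, where <code>5.class</code> is <code>Integer</code>:</p>

```ruby
# Render a class skeleton from facts gathered while reading a spec.
def render_skeleton(class_name, method_name, description, return_type)
  [
    "class #{class_name}",
    "  # It #{description}",
    "  #",
    "  # @return [#{return_type}]",
    "  def #{method_name}",
    "    # Code goes here later",
    "  end",
    "end"
  ].join("\n")
end

# The class of the value given to eq(5) supplies the documented return type.
skeleton = render_skeleton(:Foo, 'bar', 'does something magical', 5.class)
puts skeleton
```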


<p>Granted that&rsquo;s not all that impressive at this point, but beyond this stage we&rsquo;re going to find some very interesting problems. This brings me to my next section.</p>

<h2>Theoreticals</h2>

<p>Generating a skeleton is easy enough, and still very useful in its own right. Actually writing code from expectations and matchers that could be near infinite in number and complexity? That becomes a whole different story very quickly. These are theoretical musings of the future nature of Clairvoyant as I see it.</p>

<h3>The nature of the &lsquo;it&rsquo; block</h3>

<p>Past the description, we&rsquo;re defining what the logic of the program is. From here we can infer a good number of relevant details:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">it</span> <span class="s1">&#39;does this&#39;</span> <span class="k">do</span>
</span><span class='line'>  <span class="n">expect</span><span class="p">(</span><span class="nb">method</span><span class="p">(</span><span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">,</span><span class="n">c</span><span class="p">)</span><span class="o">.</span><span class="n">last</span><span class="p">)</span><span class="o">.</span><span class="n">to</span> <span class="n">eq</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>The description string of the <code>it</code> block can be used for documentation, and in some cases makes a nifty method description.</p>

<p>The actual expectation call tells us the name of the method, and potentially anything that we can call after it. In the above example we know that <code>last</code> can be called on the result of our method and that the <code>arity</code> of the method <em>can</em> be <code>3</code>. We can also guess that the return value is some form of ordered collection, which narrows the possible output types substantially. Given that the value we compare against is an integer, we can reasonably state that the return value of the method is <code>Array[Integer]</code>.</p>

<p>We have facts to work with here, and the more <code>it</code> blocks we have inside of a <code>describe</code>, the more we can divine about the method. Say another test called <code>keys</code> on our method&rsquo;s return value; now we can reasonably guess it&rsquo;s a <code>Hash</code> or a close derivative.</p>
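
<p>One way this kind of inference could be sketched is with a stand-in object that records every message sent to it, followed by a check of which core classes define those methods. <code>CallRecorder</code> and <code>guess_type</code> are hypothetical names, and the candidate list is deliberately tiny:</p>

```ruby
# Stands in for a method's return value and notes every message sent to it.
class CallRecorder
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Remember the message and return self so chained calls are recorded too.
  def method_missing(name, *_args)
    @calls.push(name)
    self
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

# Keep only the candidate classes that define every method we saw called.
def guess_type(calls, candidates = [Array, Hash, String])
  candidates.select do |klass|
    calls.all? { |name| klass.method_defined?(name) }
  end
end

from_first_test = CallRecorder.new
from_first_test.last
array_guess = guess_type(from_first_test.calls)

from_second_test = CallRecorder.new
from_second_test.keys
hash_guess = guess_type(from_second_test.calls)
```

<p>Each recorded call removes candidates, so a handful of examples can narrow the return type quickly. A real version would need a much larger candidate list and some way of ranking ties.</p>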

<p>That&rsquo;s well and good, but a lot of Ruby methods tend to be very conditional in nature. They could return different things depending on a <code>context</code>. Luckily we have just such a method we can hijack!</p>

<h3>Contextual contexts abound</h3>

<p>Say we find our method doing strange things dependent on what the <code>context</code> is. We can possibly even derive a conditional from a well laid out <code>context</code> description:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">context</span> <span class="s1">&#39;When value is even&#39;</span> <span class="k">do</span>
</span><span class='line'>  <span class="c1"># ...</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>We have a potentially bindable name in <code>value</code>, and a proper context test to throw against with <code>is even</code>. Of course at this point it&rsquo;s going to be a lot more difficult to glean this information and will be heavily dependent on the robustness of a tokenization library and parser.</p>

<p>This becomes substantially more difficult to reason about, because now we&rsquo;re trying to tell people how to write their tests instead of divining information from what&rsquo;s already there. That may be fine for new code but can be incredibly tedious to make behave properly.</p>

<h3>Expectational matchers</h3>

<p>Matchers provide an even more interesting challenge, especially factoring in user-defined options. We can make some reasonable assertions based on <code>raise_error</code> to give some error handling for dynamic methods, but that again becomes dependent on <code>context</code> blocks being clear enough to grok.</p>

<h3>Meta-testing</h3>

<p>Given that we have all of that figured out, the next fun part is proving whether or not what we did even works right. At this point we can run our generated code through RSpec again, as the core team intended, to see if we made it pass. If we did, great! That&rsquo;s the easy case if we&rsquo;ve already gotten this far. If we didn&rsquo;t, on the other hand, it opens up a whole different can of worms.</p>

<h3>Meta-Meta-testing</h3>

<p>So maybe it didn&rsquo;t pass. Hey! It gave us data back to use to further refine and polish our solutions. We can use that to (hopefully) get them to pass on the next round! The failed tests would be reported back, and we could do one of two things:</p>

<p>Fail the method and leave a comment for the user, or attempt to repair the method to make the test pass.</p>

<p>The first would be far more practical if we&rsquo;ve gotten this far, but hey, we&rsquo;re in theory land. Let&rsquo;s push our luck a bit more here.</p>

<h3>Meta-.*-testing</h3>

<p>At this stage we would be throwing code back and forth until something works, a very brute-force approach in the hope that we hit the sweet spot. In something that could only be compared to the quandary of monkeys writing Shakespeare, we might squeeze just a bit more code out of there.</p>

<p>Though honestly, the halting problem is just a euphemism for being dull. Let&rsquo;s have more fun!</p>

<h3>S-Expression analysis</h3>

<p>So we can get a hold of a lot of your application code as well, right? Let&rsquo;s not limit that. Let&rsquo;s grab as much Ruby code as we can stuff into memory and try to find patterns between the RSpec code and the application code. Machine learning and deep analysis can be applied to more accurately divine intended code based on community behaviors (though I will explicitly prune out you maniacs who use globals like candy).</p>

<p>Throw it in a Spark cluster and let the thing roar. We&rsquo;re deep into AI land of making some very interesting code generation black magic happen, and probably well beyond anything that&rsquo;s been attempted up to this point.</p>

<p>I have no qualms saying this is well beyond me, but it sounds like a blast to try anyways.</p>

<h2>You&rsquo;re out of your mind</h2>

<p>It wouldn&rsquo;t be the first time I&rsquo;ve been told this, and certainly won&rsquo;t be the last. This is a personal project and a great deal of fun in learning Ruby internals along the way. Maybe one day this will be a fully functional project that can magically make your wildest dreams come true, or maybe not.</p>

<p>Really, when it gets down to it, that&rsquo;s the fun of it. The potentials here are limitless, the problem hard, and the code plentiful. That&rsquo;s the best type of problem to poke at. It&rsquo;d be no fun if I knew entirely what I was doing.</p>

<p>Check out <a href="https://github.com/baweaver/clairvoyant">Clairvoyant</a>, leave me a comment, let me know what you think!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Intro to Spark]]></title>
    <link href="http://www.baweaver.com/blog/2015/06/21/intro-to-spark/"/>
    <updated>2015-06-21T18:13:24-07:00</updated>
    <id>http://www.baweaver.com/blog/2015/06/21/intro-to-spark</id>
    <content type="html"><![CDATA[<p>Assuming you&rsquo;ve read the first article on <a href="http://baweaver.com/blog/2015/06/20/a-functional-programming-primer-for-spark/">Functional Programming in Scala and Python</a>, you should be ready to sink your teeth into a few practical Spark problems.</p>

<!-- more -->


<h2>Getting Spark</h2>

<p>The first step to running Spark is to get a standalone instance to play with on our machines.</p>

<p>Go to the Spark downloads page: <a href="https://spark.apache.org/downloads.html">https://spark.apache.org/downloads.html</a></p>

<p>We&rsquo;ll be using version <code>1.4.0</code>. Select that version from releases, and select Pre-built for Hadoop 2.6 and later (unless you currently have another Hadoop / HDFS instance at a different version.)</p>

<p>Go ahead and download / unpack that into the directory of your choice, and <code>cd</code> into it.</p>

<h2>Getting our wordlist</h2>

<p>We&rsquo;ll be using an <a href="http://www-01.sil.org/linguistics/wordlists/english/">English wordlist from SIL</a> for the following exercises. Make sure to save <code>wordsEn.txt</code> somewhere you can load it from later.</p>

<h2>Spark REPL</h2>

<p>The last tutorial mentioned the concept of a REPL as a way to play with code interactively. Handily enough, Spark ships its own REPL for Scala and Python (but not Java).</p>

<p>For Scala that would be <code>bin/spark-shell</code></p>

<p>For Python, it&rsquo;s <code>bin/pyspark</code></p>

<p>You should see something like this (snipped for length):</p>

<figure class='code'><figcaption><span>spark-repl-scala</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class='text'><span class='line'>Welcome to
</span><span class='line'>      ____              __
</span><span class='line'>     / __/__  ___ _____/ /__
</span><span class='line'>    _\ \/ _ \/ _ `/ __/  &#39;_/
</span><span class='line'>   /___/ .__/\_,_/_/ /_/\_\   version 1.3.1
</span><span class='line'>      /_/
</span><span class='line'>
</span><span class='line'>Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_31)
</span><span class='line'>Type in expressions to have them evaluated.
</span><span class='line'>Type :help for more information.
</span></code></pre></td></tr></table></div></figure>


<p>or this:</p>

<figure class='code'><figcaption><span>spark-repl-python</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class='text'><span class='line'>Welcome to
</span><span class='line'>      ____              __
</span><span class='line'>     / __/__  ___ _____/ /__
</span><span class='line'>    _\ \/ _ \/ _ `/ __/  &#39;_/
</span><span class='line'>   /__ / .__/\_,_/_/ /_/\_\   version 1.3.1
</span><span class='line'>      /_/
</span><span class='line'>
</span><span class='line'>Using Python version 2.7.5 (default, Mar  9 2014 22:15:05)
</span><span class='line'>SparkContext available as sc, HiveContext available as sqlContext.
</span></code></pre></td></tr></table></div></figure>


<p>There will be considerably more debugging and logging output around it, but for our purposes those are the lines to look for.</p>

<h2>Spark Context</h2>

<p>In the Spark shell, we&rsquo;re given a Spark context as <code>sc</code>, our entry point into the Spark library. We can use that to load in our text file:</p>

<figure class='code'><figcaption><span>scala-wordlist</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">val</span> <span class="n">wordList</span> <span class="k">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="o">(</span><span class="s">&quot;/Users/lemur/dev/wordlist/wordsEn.txt&quot;</span><span class="o">)</span>
</span><span class='line'><span class="c1">// ...debugger output</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-wordlist</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">wordList</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="p">(</span><span class="s">&quot;/Users/lemur/dev/wordlist/wordsEn.txt&quot;</span><span class="p">)</span>
</span><span class='line'><span class="c"># ...debugger output</span>
</span></code></pre></td></tr></table></div></figure>


<p><strong>WARNING</strong> - Remember last time when I mentioned Spark was lazy? If you type that path in wrong, it&rsquo;s not going to tell you anything until you try to run commands on it. The same goes for a lot of functions in Spark: you won&rsquo;t know it&rsquo;s broken until you run it.</p>
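
<p>As a rough analogy in plain Python (this is not Spark itself), a generator defers its work in much the same way: building it succeeds even with a bad path, and the error only surfaces once a result is requested. The <code>lazy_lines</code> helper and the path below are made up purely for illustration.</p>

```python
def lazy_lines(path):
    """Yield lines from path; nothing is opened until iteration starts."""
    with open(path) as handle:
        for line in handle:
            yield line.rstrip("\n")

# Building the generator succeeds even though this (hypothetical) path is wrong...
words = lazy_lines("/no/such/dir/wordsEn.txt")

# ...and the failure only appears once we actually ask for a result.
try:
    count = sum(1 for _ in words)
except IOError as error:
    count = None
    failure = error
    print("Only failed when counting:", failure)
```

<p>Spark&rsquo;s <code>textFile</code> followed by <code>count()</code> fails at the same kind of moment: on the action, not on the definition.</p>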

<p>Now we have our file available to experiment with as an RDD (Resilient Distributed Dataset), Spark&rsquo;s abstraction for distributed data. Since Spark is lazy, nothing is actually read until we run a command on it.</p>

<p>Let&rsquo;s try a basic one to start: how many lines are in the file? (I&rsquo;m going to be trimming output so we don&rsquo;t fill the page with debugger info.)</p>

<figure class='code'><figcaption><span>scala-wordlist-count</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">wordList</span><span class="o">.</span><span class="n">count</span><span class="o">()</span>
</span><span class='line'><span class="c1">// ...debugger output</span>
</span><span class='line'><span class="n">res0</span><span class="k">:</span> <span class="kt">Long</span> <span class="o">=</span> <span class="mi">109583</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-wordlist-count</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">wordList</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
</span><span class='line'><span class="c"># ...debugger output</span>
</span><span class='line'><span class="mi">109583</span>
</span></code></pre></td></tr></table></div></figure>


<p>With that you&rsquo;ve just run a Spark job. Simple as that, and not much different than how you&rsquo;d interact with anything else.</p>

<h2>Starts with</h2>

<p>Now, since this is a dictionary, each word is in there once. That makes a wordcount a bit pointless, so instead let&rsquo;s count how many words start with each letter:</p>

<figure class='code'><figcaption><span>starts-with-scala</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">wordList</span><span class="o">.</span><span class="n">filter</span><span class="o">(</span><span class="k">_</span> <span class="o">!=</span> <span class="s">&quot;&quot;</span><span class="o">).</span><span class="n">map</span><span class="o">(</span><span class="n">word</span> <span class="k">=&gt;</span> <span class="o">(</span><span class="n">word</span><span class="o">(</span><span class="mi">0</span><span class="o">),</span> <span class="mi">1</span><span class="o">)).</span><span class="n">reduceByKey</span><span class="o">(</span><span class="k">_</span><span class="o">+</span><span class="k">_</span><span class="o">).</span><span class="n">foreach</span><span class="o">(</span><span class="n">println</span><span class="o">)</span>
</span><span class='line'>
</span><span class='line'><span class="o">(</span><span class="n">w</span><span class="o">,</span><span class="mi">2714</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">s</span><span class="o">,</span><span class="mi">12108</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">e</span><span class="o">,</span><span class="mi">4494</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">a</span><span class="o">,</span><span class="mi">6541</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">k</span><span class="o">,</span><span class="mi">964</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">i</span><span class="o">,</span><span class="mi">4382</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">y</span><span class="o">,</span><span class="mi">370</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">u</span><span class="o">,</span><span class="mi">3312</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">o</span><span class="o">,</span><span class="mi">2966</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">q</span><span class="o">,</span><span class="mi">577</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">g</span><span class="o">,</span><span class="mi">3594</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">d</span><span class="o">,</span><span class="mi">6694</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">z</span><span class="o">,</span><span class="mi">265</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">m</span><span class="o">,</span><span class="mi">5806</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">c</span><span class="o">,</span><span class="mi">10324</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">p</span><span class="o">,</span><span class="mi">8448</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">x</span><span class="o">,</span><span class="mi">79</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">t</span><span class="o">,</span><span class="mi">5530</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">b</span><span class="o">,</span><span class="mi">6280</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">h</span><span class="o">,</span><span class="mi">3920</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">n</span><span class="o">,</span><span class="mi">2475</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">f</span><span class="o">,</span><span class="mi">4701</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">j</span><span class="o">,</span><span class="mi">1046</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">v</span><span class="o">,</span><span class="mi">1825</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">r</span><span class="o">,</span><span class="mi">6804</span><span class="o">)</span>
</span><span class='line'><span class="o">(</span><span class="n">l</span><span class="o">,</span><span class="mi">3363</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>starts-with-python</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="n">letterCounts</span> <span class="o">=</span> <span class="n">wordList</span> \
</span><span class='line'>  <span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="k">lambda</span> <span class="n">w</span><span class="p">:</span> <span class="n">w</span> <span class="o">!=</span> <span class="s">&quot;&quot;</span><span class="p">)</span> \
</span><span class='line'>  <span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">w</span><span class="p">:</span> <span class="p">(</span><span class="n">w</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="mi">1</span><span class="p">))</span> \
</span><span class='line'>  <span class="o">.</span><span class="n">reduceByKey</span><span class="p">(</span><span class="k">lambda</span> <span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">:</span> <span class="n">a</span> <span class="o">+</span> <span class="n">b</span><span class="p">)</span> \
</span><span class='line'>  <span class="o">.</span><span class="n">collect</span><span class="p">()</span> <span class="c"># Force the result to run</span>
</span><span class='line'>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="k">for</span> <span class="n">count</span> <span class="ow">in</span> <span class="n">letterCounts</span><span class="p">:</span>
</span><span class='line'><span class="o">...</span>   <span class="k">print</span> <span class="n">count</span>
</span><span class='line'><span class="o">...</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;a&#39;</span><span class="p">,</span> <span class="mi">6541</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;c&#39;</span><span class="p">,</span> <span class="mi">10324</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;e&#39;</span><span class="p">,</span> <span class="mi">4494</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;g&#39;</span><span class="p">,</span> <span class="mi">3594</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;i&#39;</span><span class="p">,</span> <span class="mi">4382</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;k&#39;</span><span class="p">,</span> <span class="mi">964</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;m&#39;</span><span class="p">,</span> <span class="mi">5806</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;o&#39;</span><span class="p">,</span> <span class="mi">2966</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;q&#39;</span><span class="p">,</span> <span class="mi">577</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;s&#39;</span><span class="p">,</span> <span class="mi">12108</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;u&#39;</span><span class="p">,</span> <span class="mi">3312</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;w&#39;</span><span class="p">,</span> <span class="mi">2714</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;y&#39;</span><span class="p">,</span> <span class="mi">370</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;b&#39;</span><span class="p">,</span> <span class="mi">6280</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;d&#39;</span><span class="p">,</span> <span class="mi">6694</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;f&#39;</span><span class="p">,</span> <span class="mi">4701</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;h&#39;</span><span class="p">,</span> <span class="mi">3920</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;j&#39;</span><span class="p">,</span> <span class="mi">1046</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;l&#39;</span><span class="p">,</span> <span class="mi">3363</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;n&#39;</span><span class="p">,</span> <span class="mi">2475</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;p&#39;</span><span class="p">,</span> <span class="mi">8448</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;r&#39;</span><span class="p">,</span> <span class="mi">6804</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;t&#39;</span><span class="p">,</span> <span class="mi">5530</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;v&#39;</span><span class="p">,</span> <span class="mi">1825</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;x&#39;</span><span class="p">,</span> <span class="mi">79</span><span class="p">)</span>
</span><span class='line'><span class="p">(</span><span class="s">u&#39;z&#39;</span><span class="p">,</span> <span class="mi">265</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<h2>Spark SQL</h2>

<p>On occasion we&rsquo;ll have the niceties of structured data such as JSON, and Spark has just the way to deal with it using Spark SQL.</p>

<p><strong>WARNING</strong> - The Spark guide says:</p>

<blockquote><p>Note that the file that is offered as a json file is not a typical JSON file. Each line must contain a separate, self-contained valid JSON object. As a consequence, a regular multi-line JSON file will most often fail.</p></blockquote>

<p>&hellip;and it will crash if you pass it a regular multi-line JSON file, valid though that is. If any reader knows the reasoning behind this particularly confounding piece of work, I&rsquo;d love to know.</p>
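
<p>If your data is a regular JSON array, one way around this is to rewrite it as one object per line before handing it to Spark. Below is a minimal sketch using only Python&rsquo;s standard <code>json</code> module; the <code>to_json_lines</code> name and the two sample records are invented for illustration, and it assumes the top-level value is an array of objects.</p>

```python
import json

def to_json_lines(document):
    """Turn a JSON array of objects into one compact JSON object per line."""
    records = json.loads(document)
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)

# A made-up, pretty-printed JSON array spanning several lines.
pretty = """
[
  {"name": "Ada", "isActive": false},
  {"name": "Grace", "isActive": true}
]
"""

print(to_json_lines(pretty))
# {"isActive": false, "name": "Ada"}
# {"isActive": true, "name": "Grace"}
```

<p>Writing that output to a file gives you something in the shape the guide describes: each line a separate, self-contained JSON object.</p>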

<p>We&rsquo;ll be using fake people data: <a href="https://gist.githubusercontent.com/baweaver/b6460bb96feff1faeb78/raw/4c9b46be165725d041ff47bdc042c6a4880c1877/people.json">people.json</a> (right click to save)</p>

<p>Let&rsquo;s go ahead and load it up using the <code>sqlContext</code>:</p>

<figure class='code'><figcaption><span>scala-sql-load</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">val</span> <span class="n">people</span> <span class="k">=</span> <span class="n">sqlContext</span><span class="o">.</span><span class="n">jsonFile</span><span class="o">(</span><span class="s">&quot;/Users/lemur/dev/wordlist/people.json&quot;</span><span class="o">)</span>
</span><span class='line'><span class="n">people</span><span class="k">:</span> <span class="kt">org.apache.spark.sql.DataFrame</span> <span class="o">=</span> <span class="o">[</span><span class="k">_</span><span class="kt">id:</span> <span class="kt">string</span>, <span class="kt">address:</span> <span class="kt">string</span>, <span class="kt">age:</span> <span class="kt">bigint</span>, <span class="kt">balance:</span> <span class="kt">double</span>, <span class="kt">company:</span> <span class="kt">string</span>, <span class="kt">email:</span> <span class="kt">string</span>, <span class="kt">eyeColor:</span> <span class="kt">string</span>, <span class="kt">gender:</span> <span class="kt">string</span>, <span class="kt">guid:</span> <span class="kt">string</span>, <span class="kt">index:</span> <span class="kt">bigint</span>, <span class="kt">isActive:</span> <span class="kt">boolean</span>, <span class="kt">latitude:</span> <span class="kt">double</span>, <span class="kt">longitude:</span> <span class="kt">double</span>, <span class="kt">name:</span> <span class="kt">string</span>, <span class="kt">phone:</span> <span class="kt">string</span>, <span class="kt">picture:</span> <span class="kt">string</span>, <span class="kt">registered:</span> <span class="kt">string</span><span class="o">]</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Make SURE to register it as a table</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">people</span><span class="o">.</span><span class="n">registerTempTable</span><span class="o">(</span><span class="s">&quot;people&quot;</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-sql-load</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">people</span> <span class="o">=</span> <span class="n">sqlContext</span><span class="o">.</span><span class="n">jsonFile</span><span class="p">(</span><span class="s">&quot;/Users/lemur/dev/wordlist/people.json&quot;</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">people</span>
</span><span class='line'><span class="n">DataFrame</span><span class="p">[</span><span class="n">_id</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">address</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">age</span><span class="p">:</span> <span class="n">bigint</span><span class="p">,</span> <span class="n">balance</span><span class="p">:</span> <span class="n">double</span><span class="p">,</span> <span class="n">company</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">email</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">eyeColor</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">gender</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">guid</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">index</span><span class="p">:</span> <span class="n">bigint</span><span class="p">,</span> <span class="n">isActive</span><span class="p">:</span> <span class="n">boolean</span><span class="p">,</span> <span class="n">latitude</span><span class="p">:</span> <span class="n">double</span><span class="p">,</span> <span class="n">longitude</span><span class="p">:</span> <span class="n">double</span><span class="p">,</span> <span class="n">name</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">phone</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">picture</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">registered</span><span class="p">:</span> <span class="n">string</span><span class="p">]</span>
</span><span class='line'>
</span><span class='line'><span class="c"># Make SURE to register it as a table</span>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">people</span><span class="o">.</span><span class="n">registerTempTable</span><span class="p">(</span><span class="s">&quot;people&quot;</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Let&rsquo;s start with a fairly basic SQL query that gets the index of every inactive person with a balance greater than $2000, then counts the results:</p>

<figure class='code'><figcaption><span>scala-sql-basic</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="c1">// Note I&#39;m calling on SQL Context here</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">sqlContext</span><span class="o">.</span><span class="n">sql</span><span class="o">(</span><span class="s">&quot;&quot;&quot;</span>
</span><span class='line'><span class="s">     |   SELECT index</span>
</span><span class='line'><span class="s">     |   FROM people</span>
</span><span class='line'><span class="s">     |   WHERE isActive == false AND</span>
</span><span class='line'><span class="s">     |         balance &gt; 2000.00</span>
</span><span class='line'><span class="s">     | &quot;&quot;&quot;</span><span class="o">).</span><span class="n">count</span><span class="o">()</span>
</span><span class='line'>
</span><span class='line'><span class="n">res1</span><span class="k">:</span> <span class="kt">Long</span> <span class="o">=</span> <span class="mi">75</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-sql-basic</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">sqlContext</span><span class="o">.</span><span class="n">sql</span><span class="p">(</span><span class="s">&quot;&quot;&quot;</span>
</span><span class='line'><span class="s">...   SELECT index</span>
</span><span class='line'><span class="s">...   FROM people</span>
</span><span class='line'><span class="s">...   WHERE isActive == false AND</span>
</span><span class='line'><span class="s">...         balance &gt; 2000.00</span>
</span><span class='line'><span class="s">... &quot;&quot;&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
</span><span class='line'>
</span><span class='line'><span class="mi">75</span>
</span></code></pre></td></tr></table></div></figure>


<p>Triple quotes are a life saver when making larger SQL-like strings.</p>

<p>As with SQL, you can join, count, group, and perform various other operations, all in a big data context. It&rsquo;s a shame it won&rsquo;t play nicely with actual JSON, but the features are handy nonetheless.</p>
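
<p>If the SQL reads a bit abstractly, here is a rough sketch of what that query computes, written as plain Python with no Spark involved. The four sample rows and the <code>inactive_with_balance_over</code> helper are made up for illustration; the real <code>people</code> table has far more rows and columns.</p>

```python
# Hypothetical sample rows standing in for the `people` table.
people = [
    {"index": 0, "isActive": False, "balance": 2500.75},
    {"index": 1, "isActive": True,  "balance": 3100.00},
    {"index": 2, "isActive": False, "balance": 1200.10},
    {"index": 3, "isActive": False, "balance": 2000.01},
]

def inactive_with_balance_over(rows, minimum):
    """Return the index of every inactive row whose balance exceeds `minimum`."""
    return [row["index"] for row in rows
            if not row["isActive"] and row["balance"] > minimum]

# The SELECT index ... WHERE ... part of the query
matches = inactive_with_balance_over(people, 2000.00)
print(matches)

# The .count() part of the query
print(len(matches))
```

<p>The difference with Spark is that the same filter and count get spread across a cluster rather than looping over a list in one process.</p>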

<p><a href="https://spark.apache.org/docs/1.4.0/sql-programming-guide.html#starting-point-sqlcontext">Further reading</a></p>

<h2>Spark MLlib - Statistics</h2>

<p>Spark even comes with its own Machine Learning libraries, but for the sake of brevity we&rsquo;re only going to look into some of the basic statistical options. Later tutorials will address this in some depth.</p>

<p>We&rsquo;ll be looking into the column stats for the word lengths of our wordList from earlier:</p>

<figure class='code'><figcaption><span>scala-statistics-basics</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="c1">// Make SURE to import it</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">import</span> <span class="nn">org.apache.spark.mllib.stat.Statistics</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">import</span> <span class="nn">org.apache.spark.mllib.linalg.Vectors</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">val</span> <span class="n">wordList</span> <span class="k">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="o">(</span><span class="s">&quot;/Users/lemur/dev/wordlist/wordsEn.txt&quot;</span><span class="o">)</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">val</span> <span class="n">wordLengths</span> <span class="k">=</span> <span class="n">wordList</span><span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">w</span> <span class="k">=&gt;</span> <span class="nc">Vectors</span><span class="o">.</span><span class="n">dense</span><span class="o">(</span><span class="n">w</span><span class="o">.</span><span class="n">length</span><span class="o">))</span>
</span><span class='line'><span class="n">wordLengths</span><span class="k">:</span> <span class="kt">org.apache.spark.rdd.RDD</span><span class="o">[</span><span class="kt">org.apache.spark.mllib.linalg.Vector</span><span class="o">]</span> <span class="k">=</span> <span class="nc">MapPartitionsRDD</span><span class="o">[</span><span class="err">6</span><span class="o">]</span> <span class="n">at</span> <span class="n">map</span> <span class="n">at</span> <span class="o">&lt;</span><span class="n">console</span><span class="k">&gt;:</span><span class="mi">32</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">val</span> <span class="n">summaryStatistics</span> <span class="k">=</span> <span class="nc">Statistics</span><span class="o">.</span><span class="n">colStats</span><span class="o">(</span><span class="n">wordLengths</span><span class="o">)</span>
</span><span class='line'><span class="n">summaryStatistics</span><span class="k">:</span> <span class="kt">org.apache.spark.mllib.stat.MultivariateStatisticalSummary</span> <span class="o">=</span> <span class="n">org</span><span class="o">.</span><span class="n">apache</span><span class="o">.</span><span class="n">spark</span><span class="o">.</span><span class="n">mllib</span><span class="o">.</span><span class="n">stat</span><span class="o">.</span><span class="nc">MultivariateOnlineSummarizer</span><span class="k">@</span><span class="mi">4377</span><span class="n">e40a</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Let&#39;s take a look inside shall we?</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">summaryStatistics</span><span class="o">.</span><span class="n">mean</span>
</span><span class='line'><span class="n">res22</span><span class="k">:</span> <span class="kt">org.apache.spark.mllib.linalg.Vector</span> <span class="o">=</span> <span class="o">[</span><span class="err">8</span><span class="kt">.</span><span class="err">533905806557591</span><span class="o">]</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">summaryStatistics</span><span class="o">.</span><span class="n">max</span>
</span><span class='line'><span class="n">res23</span><span class="k">:</span> <span class="kt">org.apache.spark.mllib.linalg.Vector</span> <span class="o">=</span> <span class="o">[</span><span class="err">28</span><span class="kt">.</span><span class="err">0</span><span class="o">]</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">summaryStatistics</span><span class="o">.</span><span class="n">min</span>
</span><span class='line'><span class="n">res24</span><span class="k">:</span> <span class="kt">org.apache.spark.mllib.linalg.Vector</span> <span class="o">=</span> <span class="o">[</span><span class="err">0</span><span class="kt">.</span><span class="err">0</span><span class="o">]</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">summaryStatistics</span><span class="o">.</span><span class="n">variance</span>
</span><span class='line'><span class="n">res25</span><span class="k">:</span> <span class="kt">org.apache.spark.mllib.linalg.Vector</span> <span class="o">=</span> <span class="o">[</span><span class="err">6</span><span class="kt">.</span><span class="err">448337984119102</span><span class="o">]</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-statistics-basics</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="c"># Make SURE to import it</span>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="kn">from</span> <span class="nn">pyspark.mllib.stat</span> <span class="kn">import</span> <span class="n">Statistics</span>
</span><span class='line'>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">wordList</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="p">(</span><span class="s">&quot;/Users/lemur/dev/wordlist/wordsEn.txt&quot;</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="c"># Python will take a standard list in</span>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">wordLengths</span> <span class="o">=</span> <span class="n">wordList</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">w</span><span class="p">:</span> <span class="p">[</span><span class="nb">len</span><span class="p">(</span><span class="n">w</span><span class="p">)])</span>
</span><span class='line'>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">summaryStatistics</span> <span class="o">=</span> <span class="n">Statistics</span><span class="o">.</span><span class="n">colStats</span><span class="p">(</span><span class="n">wordLengths</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="c"># Let&#39;s take a look inside shall we?</span>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">summaryStatistics</span><span class="o">.</span><span class="n">mean</span><span class="p">()</span>
</span><span class='line'><span class="n">array</span><span class="p">([</span> <span class="mf">8.53390581</span><span class="p">])</span>
</span><span class='line'>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">summaryStatistics</span><span class="o">.</span><span class="n">max</span><span class="p">()</span>
</span><span class='line'><span class="n">array</span><span class="p">([</span> <span class="mf">28.</span><span class="p">])</span>
</span><span class='line'>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">summaryStatistics</span><span class="o">.</span><span class="n">min</span><span class="p">()</span>
</span><span class='line'><span class="n">array</span><span class="p">([</span> <span class="mf">0.</span><span class="p">])</span>
</span><span class='line'>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">summaryStatistics</span><span class="o">.</span><span class="n">variance</span><span class="p">()</span>
</span><span class='line'><span class="n">array</span><span class="p">([</span> <span class="mf">6.44833798</span><span class="p">])</span>
</span></code></pre></td></tr></table></div></figure>
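
<p>To make those four numbers less of a black box, here is a minimal sketch of the same summary computed by hand with Python&rsquo;s standard <code>statistics</code> module. The word list is a hypothetical stand-in for <code>wordsEn.txt</code>, and I&rsquo;m assuming <code>colStats</code> reports the sample variance (dividing by n - 1), which is what <code>statistics.variance</code> computes.</p>

```python
import statistics

# Hypothetical stand-in for the lines of the word list.
# The empty string mimics a blank line, which would have length 0.
words = ["a", "lemur", "spark", "functional", ""]

# Same idea as wordList.map(lambda w: [len(w)]), minus the one-element list wrapper
lengths = [len(w) for w in words]

summary = {
    "mean": statistics.mean(lengths),
    "max": max(lengths),
    "min": min(lengths),
    # Sample variance: squared deviations summed, divided by (n - 1)
    "variance": statistics.variance(lengths),
}

for name, value in summary.items():
    print(name, value)
```

<p>A blank line in the input is one plausible reason for a minimum of 0.0 like the one above, though that&rsquo;s a guess on my part and not something I&rsquo;ve checked against the file.</p>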


<p><a href="https://spark.apache.org/docs/1.4.0/mllib-statistics.html">Further reading</a></p>

<h2>Wrapping Up</h2>

<p>We&rsquo;ve taken a cursory look at some of the features and basic operations of Spark. Here&rsquo;s the question though: what do you as readers want to know more about? Vote on Strawpoll to let me know: <a href="http://strawpoll.me/4701594">http://strawpoll.me/4701594</a></p>

<p>Think of it as a choose your own adventure of sorts. I&rsquo;ll be writing about all of the above in more detail, but in the order you want to see it happen.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Functional Programming Primer for Spark]]></title>
    <link href="http://www.baweaver.com/blog/2015/06/20/a-functional-programming-primer-for-spark/"/>
    <updated>2015-06-20T19:48:31-07:00</updated>
    <id>http://www.baweaver.com/blog/2015/06/20/a-functional-programming-primer-for-spark</id>
    <content type="html"><![CDATA[<p>There&rsquo;s a lot of hype around Spark and Big Data in general, especially around the concepts of Functional Programming. Problem is, Functional Programming is a tall order for a standard Java programmer.</p>

<p>The goal of this post is to get you up to speed in the very basics of Functional Programming as they&rsquo;ll later relate to Spark. I&rsquo;ll be primarily covering Scala, with alternate versions in Python.</p>

<!-- more -->


<h2>What about Java 8?</h2>

<p>I do not intend to cover Java in this tutorial or any other. MapReduce is a concept based in Functional Programming, and you would be doing yourself a great disservice by trying to shoehorn Java into that role, including Java 8.</p>
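
<p>For a sense of what &ldquo;based in Functional Programming&rdquo; means here, this is a minimal sketch of a word count as a map step followed by a reduce step, in plain Python. The input lines and the <code>add_pair</code> helper are hypothetical, and this is only the shape of the idea, not how Hadoop or Spark implement it.</p>

```python
from functools import reduce

# Hypothetical input; a real job would read these lines from files
lines = ["the quick brown fox", "the lazy dog", "the fox"]

# Map step (a flat map, really): every line becomes a run of (word, 1) pairs
pairs = [(word, 1) for line in lines for word in line.split()]

def add_pair(counts, pair):
    """Fold one (word, n) pair into a new dict of running totals."""
    word, n = pair
    return {**counts, word: counts.get(word, 0) + n}

# Reduce step: combine all the pairs into one total per word
word_counts = reduce(add_pair, pairs, {})
print(word_counts)
```

<p>Note that <code>add_pair</code> returns a new dict instead of mutating the old one, so each step depends only on its inputs.</p>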

<p>You might wonder how bad Java could possibly be; perhaps I&rsquo;m just biased. I would direct you to look at the <a href="http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v2.0">Hadoop word count example</a> and see the horrors of allowing Java patterns and card-carrying GoF members to pretend they&rsquo;re programming functionally:</p>

<figure class='code'><figcaption><span>hadoop-wordcount-example</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
<span class='line-number'>36</span>
<span class='line-number'>37</span>
<span class='line-number'>38</span>
<span class='line-number'>39</span>
<span class='line-number'>40</span>
<span class='line-number'>41</span>
<span class='line-number'>42</span>
<span class='line-number'>43</span>
<span class='line-number'>44</span>
<span class='line-number'>45</span>
<span class='line-number'>46</span>
<span class='line-number'>47</span>
<span class='line-number'>48</span>
<span class='line-number'>49</span>
<span class='line-number'>50</span>
<span class='line-number'>51</span>
<span class='line-number'>52</span>
<span class='line-number'>53</span>
<span class='line-number'>54</span>
<span class='line-number'>55</span>
<span class='line-number'>56</span>
<span class='line-number'>57</span>
<span class='line-number'>58</span>
<span class='line-number'>59</span>
<span class='line-number'>60</span>
<span class='line-number'>61</span>
<span class='line-number'>62</span>
<span class='line-number'>63</span>
<span class='line-number'>64</span>
<span class='line-number'>65</span>
<span class='line-number'>66</span>
<span class='line-number'>67</span>
<span class='line-number'>68</span>
<span class='line-number'>69</span>
<span class='line-number'>70</span>
<span class='line-number'>71</span>
<span class='line-number'>72</span>
<span class='line-number'>73</span>
<span class='line-number'>74</span>
<span class='line-number'>75</span>
<span class='line-number'>76</span>
<span class='line-number'>77</span>
<span class='line-number'>78</span>
<span class='line-number'>79</span>
<span class='line-number'>80</span>
<span class='line-number'>81</span>
<span class='line-number'>82</span>
<span class='line-number'>83</span>
<span class='line-number'>84</span>
<span class='line-number'>85</span>
<span class='line-number'>86</span>
<span class='line-number'>87</span>
<span class='line-number'>88</span>
<span class='line-number'>89</span>
<span class='line-number'>90</span>
<span class='line-number'>91</span>
<span class='line-number'>92</span>
<span class='line-number'>93</span>
<span class='line-number'>94</span>
<span class='line-number'>95</span>
<span class='line-number'>96</span>
<span class='line-number'>97</span>
<span class='line-number'>98</span>
<span class='line-number'>99</span>
<span class='line-number'>100</span>
<span class='line-number'>101</span>
<span class='line-number'>102</span>
<span class='line-number'>103</span>
<span class='line-number'>104</span>
<span class='line-number'>105</span>
<span class='line-number'>106</span>
<span class='line-number'>107</span>
<span class='line-number'>108</span>
<span class='line-number'>109</span>
<span class='line-number'>110</span>
<span class='line-number'>111</span>
<span class='line-number'>112</span>
<span class='line-number'>113</span>
<span class='line-number'>114</span>
<span class='line-number'>115</span>
<span class='line-number'>116</span>
<span class='line-number'>117</span>
<span class='line-number'>118</span>
<span class='line-number'>119</span>
<span class='line-number'>120</span>
<span class='line-number'>121</span>
<span class='line-number'>122</span>
<span class='line-number'>123</span>
<span class='line-number'>124</span>
<span class='line-number'>125</span>
<span class='line-number'>126</span>
<span class='line-number'>127</span>
<span class='line-number'>128</span>
<span class='line-number'>129</span>
<span class='line-number'>130</span>
<span class='line-number'>131</span>
<span class='line-number'>132</span>
<span class='line-number'>133</span>
</pre></td><td class='code'><pre><code class='java'><span class='line'><span class="kn">import</span> <span class="nn">java.io.BufferedReader</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">java.io.FileReader</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">java.io.IOException</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">java.net.URI</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">java.util.ArrayList</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">java.util.HashSet</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">java.util.List</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">java.util.Set</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">java.util.StringTokenizer</span><span class="o">;</span>
</span><span class='line'>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.conf.Configuration</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.fs.Path</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.io.IntWritable</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.io.Text</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.mapreduce.Job</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.mapreduce.Mapper</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.mapreduce.Reducer</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.mapreduce.lib.input.FileInputFormat</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.mapreduce.lib.output.FileOutputFormat</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.mapreduce.Counter</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.util.GenericOptionsParser</span><span class="o">;</span>
</span><span class='line'><span class="kn">import</span> <span class="nn">org.apache.hadoop.util.StringUtils</span><span class="o">;</span>
</span><span class='line'>
</span><span class='line'><span class="kd">public</span> <span class="kd">class</span> <span class="nc">WordCount2</span> <span class="o">{</span>
</span><span class='line'>
</span><span class='line'>  <span class="kd">public</span> <span class="kd">static</span> <span class="kd">class</span> <span class="nc">TokenizerMapper</span>
</span><span class='line'>       <span class="kd">extends</span> <span class="n">Mapper</span><span class="o">&lt;</span><span class="n">Object</span><span class="o">,</span> <span class="n">Text</span><span class="o">,</span> <span class="n">Text</span><span class="o">,</span> <span class="n">IntWritable</span><span class="o">&gt;{</span>
</span><span class='line'>
</span><span class='line'>    <span class="kd">static</span> <span class="kd">enum</span> <span class="n">CountersEnum</span> <span class="o">{</span> <span class="n">INPUT_WORDS</span> <span class="o">}</span>
</span><span class='line'>
</span><span class='line'>    <span class="kd">private</span> <span class="kd">final</span> <span class="kd">static</span> <span class="n">IntWritable</span> <span class="n">one</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">IntWritable</span><span class="o">(</span><span class="mi">1</span><span class="o">);</span>
</span><span class='line'>    <span class="kd">private</span> <span class="n">Text</span> <span class="n">word</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">Text</span><span class="o">();</span>
</span><span class='line'>
</span><span class='line'>    <span class="kd">private</span> <span class="kt">boolean</span> <span class="n">caseSensitive</span><span class="o">;</span>
</span><span class='line'>    <span class="kd">private</span> <span class="n">Set</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">patternsToSkip</span> <span class="o">=</span> <span class="k">new</span> <span class="n">HashSet</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;();</span>
</span><span class='line'>
</span><span class='line'>    <span class="kd">private</span> <span class="n">Configuration</span> <span class="n">conf</span><span class="o">;</span>
</span><span class='line'>    <span class="kd">private</span> <span class="n">BufferedReader</span> <span class="n">fis</span><span class="o">;</span>
</span><span class='line'>
</span><span class='line'>    <span class="nd">@Override</span>
</span><span class='line'>    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">setup</span><span class="o">(</span><span class="n">Context</span> <span class="n">context</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">IOException</span><span class="o">,</span>
</span><span class='line'>        <span class="n">InterruptedException</span> <span class="o">{</span>
</span><span class='line'>      <span class="n">conf</span> <span class="o">=</span> <span class="n">context</span><span class="o">.</span><span class="na">getConfiguration</span><span class="o">();</span>
</span><span class='line'>      <span class="n">caseSensitive</span> <span class="o">=</span> <span class="n">conf</span><span class="o">.</span><span class="na">getBoolean</span><span class="o">(</span><span class="s">&quot;wordcount.case.sensitive&quot;</span><span class="o">,</span> <span class="kc">true</span><span class="o">);</span>
</span><span class='line'>      <span class="k">if</span> <span class="o">(</span><span class="n">conf</span><span class="o">.</span><span class="na">getBoolean</span><span class="o">(</span><span class="s">&quot;wordcount.skip.patterns&quot;</span><span class="o">,</span> <span class="kc">true</span><span class="o">))</span> <span class="o">{</span>
</span><span class='line'>        <span class="n">URI</span><span class="o">[]</span> <span class="n">patternsURIs</span> <span class="o">=</span> <span class="n">Job</span><span class="o">.</span><span class="na">getInstance</span><span class="o">(</span><span class="n">conf</span><span class="o">).</span><span class="na">getCacheFiles</span><span class="o">();</span>
</span><span class='line'>        <span class="k">for</span> <span class="o">(</span><span class="n">URI</span> <span class="n">patternsURI</span> <span class="o">:</span> <span class="n">patternsURIs</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'>          <span class="n">Path</span> <span class="n">patternsPath</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">Path</span><span class="o">(</span><span class="n">patternsURI</span><span class="o">.</span><span class="na">getPath</span><span class="o">());</span>
</span><span class='line'>          <span class="n">String</span> <span class="n">patternsFileName</span> <span class="o">=</span> <span class="n">patternsPath</span><span class="o">.</span><span class="na">getName</span><span class="o">().</span><span class="na">toString</span><span class="o">();</span>
</span><span class='line'>          <span class="n">parseSkipFile</span><span class="o">(</span><span class="n">patternsFileName</span><span class="o">);</span>
</span><span class='line'>        <span class="o">}</span>
</span><span class='line'>      <span class="o">}</span>
</span><span class='line'>    <span class="o">}</span>
</span><span class='line'>
</span><span class='line'>    <span class="kd">private</span> <span class="kt">void</span> <span class="nf">parseSkipFile</span><span class="o">(</span><span class="n">String</span> <span class="n">fileName</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'>      <span class="k">try</span> <span class="o">{</span>
</span><span class='line'>        <span class="n">fis</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">BufferedReader</span><span class="o">(</span><span class="k">new</span> <span class="nf">FileReader</span><span class="o">(</span><span class="n">fileName</span><span class="o">));</span>
</span><span class='line'>        <span class="n">String</span> <span class="n">pattern</span> <span class="o">=</span> <span class="kc">null</span><span class="o">;</span>
</span><span class='line'>        <span class="k">while</span> <span class="o">((</span><span class="n">pattern</span> <span class="o">=</span> <span class="n">fis</span><span class="o">.</span><span class="na">readLine</span><span class="o">())</span> <span class="o">!=</span> <span class="kc">null</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'>          <span class="n">patternsToSkip</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">pattern</span><span class="o">);</span>
</span><span class='line'>        <span class="o">}</span>
</span><span class='line'>      <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">IOException</span> <span class="n">ioe</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'>        <span class="n">System</span><span class="o">.</span><span class="na">err</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">&quot;Caught exception while parsing the cached file &#39;&quot;</span>
</span><span class='line'>            <span class="o">+</span> <span class="n">StringUtils</span><span class="o">.</span><span class="na">stringifyException</span><span class="o">(</span><span class="n">ioe</span><span class="o">));</span>
</span><span class='line'>      <span class="o">}</span>
</span><span class='line'>    <span class="o">}</span>
</span><span class='line'>
</span><span class='line'>    <span class="nd">@Override</span>
</span><span class='line'>    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">map</span><span class="o">(</span><span class="n">Object</span> <span class="n">key</span><span class="o">,</span> <span class="n">Text</span> <span class="n">value</span><span class="o">,</span> <span class="n">Context</span> <span class="n">context</span>
</span><span class='line'>                    <span class="o">)</span> <span class="kd">throws</span> <span class="n">IOException</span><span class="o">,</span> <span class="n">InterruptedException</span> <span class="o">{</span>
</span><span class='line'>      <span class="n">String</span> <span class="n">line</span> <span class="o">=</span> <span class="o">(</span><span class="n">caseSensitive</span><span class="o">)</span> <span class="o">?</span>
</span><span class='line'>          <span class="n">value</span><span class="o">.</span><span class="na">toString</span><span class="o">()</span> <span class="o">:</span> <span class="n">value</span><span class="o">.</span><span class="na">toString</span><span class="o">().</span><span class="na">toLowerCase</span><span class="o">();</span>
</span><span class='line'>      <span class="k">for</span> <span class="o">(</span><span class="n">String</span> <span class="n">pattern</span> <span class="o">:</span> <span class="n">patternsToSkip</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'>        <span class="n">line</span> <span class="o">=</span> <span class="n">line</span><span class="o">.</span><span class="na">replaceAll</span><span class="o">(</span><span class="n">pattern</span><span class="o">,</span> <span class="s">&quot;&quot;</span><span class="o">);</span>
</span><span class='line'>      <span class="o">}</span>
</span><span class='line'>      <span class="n">StringTokenizer</span> <span class="n">itr</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">StringTokenizer</span><span class="o">(</span><span class="n">line</span><span class="o">);</span>
</span><span class='line'>      <span class="k">while</span> <span class="o">(</span><span class="n">itr</span><span class="o">.</span><span class="na">hasMoreTokens</span><span class="o">())</span> <span class="o">{</span>
</span><span class='line'>        <span class="n">word</span><span class="o">.</span><span class="na">set</span><span class="o">(</span><span class="n">itr</span><span class="o">.</span><span class="na">nextToken</span><span class="o">());</span>
</span><span class='line'>        <span class="n">context</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="n">word</span><span class="o">,</span> <span class="n">one</span><span class="o">);</span>
</span><span class='line'>        <span class="n">Counter</span> <span class="n">counter</span> <span class="o">=</span> <span class="n">context</span><span class="o">.</span><span class="na">getCounter</span><span class="o">(</span><span class="n">CountersEnum</span><span class="o">.</span><span class="na">class</span><span class="o">.</span><span class="na">getName</span><span class="o">(),</span>
</span><span class='line'>            <span class="n">CountersEnum</span><span class="o">.</span><span class="na">INPUT_WORDS</span><span class="o">.</span><span class="na">toString</span><span class="o">());</span>
</span><span class='line'>        <span class="n">counter</span><span class="o">.</span><span class="na">increment</span><span class="o">(</span><span class="mi">1</span><span class="o">);</span>
</span><span class='line'>      <span class="o">}</span>
</span><span class='line'>    <span class="o">}</span>
</span><span class='line'>  <span class="o">}</span>
</span><span class='line'>
</span><span class='line'>  <span class="kd">public</span> <span class="kd">static</span> <span class="kd">class</span> <span class="nc">IntSumReducer</span>
</span><span class='line'>       <span class="kd">extends</span> <span class="n">Reducer</span><span class="o">&lt;</span><span class="n">Text</span><span class="o">,</span><span class="n">IntWritable</span><span class="o">,</span><span class="n">Text</span><span class="o">,</span><span class="n">IntWritable</span><span class="o">&gt;</span> <span class="o">{</span>
</span><span class='line'>    <span class="kd">private</span> <span class="n">IntWritable</span> <span class="n">result</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">IntWritable</span><span class="o">();</span>
</span><span class='line'>
</span><span class='line'>    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">reduce</span><span class="o">(</span><span class="n">Text</span> <span class="n">key</span><span class="o">,</span> <span class="n">Iterable</span><span class="o">&lt;</span><span class="n">IntWritable</span><span class="o">&gt;</span> <span class="n">values</span><span class="o">,</span>
</span><span class='line'>                       <span class="n">Context</span> <span class="n">context</span>
</span><span class='line'>                       <span class="o">)</span> <span class="kd">throws</span> <span class="n">IOException</span><span class="o">,</span> <span class="n">InterruptedException</span> <span class="o">{</span>
</span><span class='line'>      <span class="kt">int</span> <span class="n">sum</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span>
</span><span class='line'>      <span class="k">for</span> <span class="o">(</span><span class="n">IntWritable</span> <span class="n">val</span> <span class="o">:</span> <span class="n">values</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'>        <span class="n">sum</span> <span class="o">+=</span> <span class="n">val</span><span class="o">.</span><span class="na">get</span><span class="o">();</span>
</span><span class='line'>      <span class="o">}</span>
</span><span class='line'>      <span class="n">result</span><span class="o">.</span><span class="na">set</span><span class="o">(</span><span class="n">sum</span><span class="o">);</span>
</span><span class='line'>      <span class="n">context</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="n">key</span><span class="o">,</span> <span class="n">result</span><span class="o">);</span>
</span><span class='line'>    <span class="o">}</span>
</span><span class='line'>  <span class="o">}</span>
</span><span class='line'>
</span><span class='line'>  <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">Exception</span> <span class="o">{</span>
</span><span class='line'>    <span class="n">Configuration</span> <span class="n">conf</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">Configuration</span><span class="o">();</span>
</span><span class='line'>    <span class="n">GenericOptionsParser</span> <span class="n">optionParser</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">GenericOptionsParser</span><span class="o">(</span><span class="n">conf</span><span class="o">,</span> <span class="n">args</span><span class="o">);</span>
</span><span class='line'>    <span class="n">String</span><span class="o">[]</span> <span class="n">remainingArgs</span> <span class="o">=</span> <span class="n">optionParser</span><span class="o">.</span><span class="na">getRemainingArgs</span><span class="o">();</span>
</span><span class='line'>    <span class="k">if</span> <span class="o">(</span><span class="n">remainingArgs</span><span class="o">.</span><span class="na">length</span> <span class="o">!=</span> <span class="mi">2</span> <span class="o">&amp;&amp;</span> <span class="n">remainingArgs</span><span class="o">.</span><span class="na">length</span> <span class="o">!=</span> <span class="mi">4</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'>      <span class="n">System</span><span class="o">.</span><span class="na">err</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">&quot;Usage: wordcount &lt;in&gt; &lt;out&gt; [-skip skipPatternFile]&quot;</span><span class="o">);</span>
</span><span class='line'>      <span class="n">System</span><span class="o">.</span><span class="na">exit</span><span class="o">(</span><span class="mi">2</span><span class="o">);</span>
</span><span class='line'>    <span class="o">}</span>
</span><span class='line'>    <span class="n">Job</span> <span class="n">job</span> <span class="o">=</span> <span class="n">Job</span><span class="o">.</span><span class="na">getInstance</span><span class="o">(</span><span class="n">conf</span><span class="o">,</span> <span class="s">&quot;word count&quot;</span><span class="o">);</span>
</span><span class='line'>    <span class="n">job</span><span class="o">.</span><span class="na">setJarByClass</span><span class="o">(</span><span class="n">WordCount2</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
</span><span class='line'>    <span class="n">job</span><span class="o">.</span><span class="na">setMapperClass</span><span class="o">(</span><span class="n">TokenizerMapper</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
</span><span class='line'>    <span class="n">job</span><span class="o">.</span><span class="na">setCombinerClass</span><span class="o">(</span><span class="n">IntSumReducer</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
</span><span class='line'>    <span class="n">job</span><span class="o">.</span><span class="na">setReducerClass</span><span class="o">(</span><span class="n">IntSumReducer</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
</span><span class='line'>    <span class="n">job</span><span class="o">.</span><span class="na">setOutputKeyClass</span><span class="o">(</span><span class="n">Text</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
</span><span class='line'>    <span class="n">job</span><span class="o">.</span><span class="na">setOutputValueClass</span><span class="o">(</span><span class="n">IntWritable</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
</span><span class='line'>
</span><span class='line'>    <span class="n">List</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">otherArgs</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ArrayList</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;();</span>
</span><span class='line'>    <span class="k">for</span> <span class="o">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="o">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">remainingArgs</span><span class="o">.</span><span class="na">length</span><span class="o">;</span> <span class="o">++</span><span class="n">i</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'>      <span class="k">if</span> <span class="o">(</span><span class="s">&quot;-skip&quot;</span><span class="o">.</span><span class="na">equals</span><span class="o">(</span><span class="n">remainingArgs</span><span class="o">[</span><span class="n">i</span><span class="o">]))</span> <span class="o">{</span>
</span><span class='line'>        <span class="n">job</span><span class="o">.</span><span class="na">addCacheFile</span><span class="o">(</span><span class="k">new</span> <span class="nf">Path</span><span class="o">(</span><span class="n">remainingArgs</span><span class="o">[++</span><span class="n">i</span><span class="o">]).</span><span class="na">toUri</span><span class="o">());</span>
</span><span class='line'>        <span class="n">job</span><span class="o">.</span><span class="na">getConfiguration</span><span class="o">().</span><span class="na">setBoolean</span><span class="o">(</span><span class="s">&quot;wordcount.skip.patterns&quot;</span><span class="o">,</span> <span class="kc">true</span><span class="o">);</span>
</span><span class='line'>      <span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
</span><span class='line'>        <span class="n">otherArgs</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">remainingArgs</span><span class="o">[</span><span class="n">i</span><span class="o">]);</span>
</span><span class='line'>      <span class="o">}</span>
</span><span class='line'>    <span class="o">}</span>
</span><span class='line'>    <span class="n">FileInputFormat</span><span class="o">.</span><span class="na">addInputPath</span><span class="o">(</span><span class="n">job</span><span class="o">,</span> <span class="k">new</span> <span class="nf">Path</span><span class="o">(</span><span class="n">otherArgs</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="mi">0</span><span class="o">)));</span>
</span><span class='line'>    <span class="n">FileOutputFormat</span><span class="o">.</span><span class="na">setOutputPath</span><span class="o">(</span><span class="n">job</span><span class="o">,</span> <span class="k">new</span> <span class="nf">Path</span><span class="o">(</span><span class="n">otherArgs</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="mi">1</span><span class="o">)));</span>
</span><span class='line'>
</span><span class='line'>    <span class="n">System</span><span class="o">.</span><span class="na">exit</span><span class="o">(</span><span class="n">job</span><span class="o">.</span><span class="na">waitForCompletion</span><span class="o">(</span><span class="kc">true</span><span class="o">)</span> <span class="o">?</span> <span class="mi">0</span> <span class="o">:</span> <span class="mi">1</span><span class="o">);</span>
</span><span class='line'>  <span class="o">}</span>
</span><span class='line'><span class="o">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>That&rsquo;s just a word count example. Imagine the headaches of anything remotely complex in that paradigm and you&rsquo;ll do the same as I did and swear off Hadoop.</p>

<p>Are there workarounds for it? Yes. You can also put a nice coat of paint on an old rusty car. It&rsquo;ll look nicer, but it&rsquo;s not fooling anyone about what&rsquo;s under the hood.</p>

<h2>Spark can do better</h2>

<p>Remember that word count example? Scala and Spark do it substantially better:</p>

<figure class='code'><figcaption><span>spark-wordcount-example</span><a href='https://spark.apache.org/examples.html'>link</a></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="k">val</span> <span class="n">textFile</span> <span class="k">=</span> <span class="n">spark</span><span class="o">.</span><span class="n">textFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">)</span>
</span><span class='line'><span class="k">val</span> <span class="n">counts</span> <span class="k">=</span> <span class="n">textFile</span><span class="o">.</span><span class="n">flatMap</span><span class="o">(</span><span class="n">line</span> <span class="k">=&gt;</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="o">(</span><span class="s">&quot; &quot;</span><span class="o">))</span>
</span><span class='line'>                 <span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">word</span> <span class="k">=&gt;</span> <span class="o">(</span><span class="n">word</span><span class="o">,</span> <span class="mi">1</span><span class="o">))</span>
</span><span class='line'>                 <span class="o">.</span><span class="n">reduceByKey</span><span class="o">(</span><span class="k">_</span> <span class="o">+</span> <span class="k">_</span><span class="o">)</span>
</span><span class='line'><span class="n">counts</span><span class="o">.</span><span class="n">saveAsTextFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Even Python leaves it in the dust:</p>

<figure class='code'><figcaption><span>python-wordcount-example</span><a href='https://spark.apache.org/examples.html'>link</a></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="n">text_file</span> <span class="o">=</span> <span class="n">spark</span><span class="o">.</span><span class="n">textFile</span><span class="p">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="p">)</span>
</span><span class='line'><span class="n">counts</span> <span class="o">=</span> <span class="n">text_file</span><span class="o">.</span><span class="n">flatMap</span><span class="p">(</span><span class="k">lambda</span> <span class="n">line</span><span class="p">:</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">&quot; &quot;</span><span class="p">))</span> \
</span><span class='line'>             <span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">word</span><span class="p">:</span> <span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> \
</span><span class='line'>             <span class="o">.</span><span class="n">reduceByKey</span><span class="p">(</span><span class="k">lambda</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">:</span> <span class="n">a</span> <span class="o">+</span> <span class="n">b</span><span class="p">)</span>
</span><span class='line'><span class="n">counts</span><span class="o">.</span><span class="n">saveAsTextFile</span><span class="p">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>You don&rsquo;t have to spend five seconds scrolling to get through that one. The point is that by defining a MapReduce task in terms of functions, we only need to tell Spark what actions to take on the data. We&rsquo;ll get to what all this means later.</p>
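
<p>To see what those three steps do without a cluster, here is a rough sketch in plain Python. It is not the Spark API: the sample <code>lines</code> and the helper <code>add_pair</code> are made up for illustration, and the standard library stands in for <code>flatMap</code>, <code>map</code> and <code>reduceByKey</code>.</p>

```python
from functools import reduce
from itertools import chain

# Hypothetical input standing in for the lines of a text file
lines = ["to be or not to be", "to type or not to type"]

# flatMap: split every line into words and flatten into one stream
words = chain.from_iterable(line.split(" ") for line in lines)

# map: pair every word with a count of 1
pairs = ((word, 1) for word in words)

# reduceByKey: fold the pairs into a dict, summing the counts per word
def add_pair(counts, pair):
    word, n = pair
    return {**counts, word: counts.get(word, 0) + n}

counts = reduce(add_pair, pairs, {})
print(counts)
```

<p>Each step is just a function applied to the output of the previous one, which is the shape Spark asks you to describe your job in.</p>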

<h2>Why Scala over Python?</h2>

<p>I advocate the usage of Scala in general for Big Data problems over Python. The reasoning is that Scala is a statically typed language, surprisingly more so than even Java (we&rsquo;ll cover that in a moment). Spark was also written in Scala, meaning its DSL is going to be very familiar if you&rsquo;re any grade of Scala programmer.</p>

<p>Python is a great language, don&rsquo;t get me wrong, but it&rsquo;s not fully functional. You&rsquo;ll see why that&rsquo;s a big deal in a moment here. That, and I don&rsquo;t like typing <code>lambda</code> all the time.</p>

<h2>Basics of Functional Programming</h2>

<p>So what is Functional Programming, besides one of the most thrown-around terms these days? Quite simply, it&rsquo;s a style in which programs are built up from functions instead of objects.</p>

<p>Digging a little deeper, Functional Programming embraces a few interesting ideals:</p>

<ul>
<li>The REPL - That&rsquo;s Read Evaluate Print Loop, a program for running code inline much like a Unix Shell</li>
<li>Mutation is forbidden - All variables are final</li>
<li>Functional purity - If you pass <code>A</code> into a function, you&rsquo;re always getting <code>B</code> back</li>
<li>Nil is dead - Null pointer exceptions begone!</li>
<li>Programs are composed of functions - Think writing a program in terms of verbs instead of nouns</li>
<li>Functions are first class citizens - You can pass functions as arguments, and even return them</li>
<li>Laziness is useful - Functions and values that don&rsquo;t evaluate until they&rsquo;re needed</li>
</ul>


<p>We&rsquo;ll be getting into each of those and why they&rsquo;re relevant to Spark here in a moment.</p>
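
<p>As a quick taste of three of those ideals (purity, first class functions and laziness), here is a small Python sketch. The names <code>twice</code>, <code>add4</code> and <code>noisy_add2</code> are invented for this example, and <code>noisy_add2</code> is deliberately impure so we can watch when the work actually happens.</p>

```python
# A pure function: the same input always gives the same output,
# and nothing outside the function is changed.
def add2(x):
    return x + 2

# Functions are first class: take one as an argument, return a new one.
def twice(f):
    return lambda x: f(f(x))

add4 = twice(add2)

# Laziness: a generator expression does no work until a value is requested.
calls = []

def noisy_add2(x):
    calls.append(x)  # side effect only so we can observe when work happens
    return add2(x)

lazy = (noisy_add2(n) for n in [1, 2, 3])
before = list(calls)   # nothing has been evaluated yet
first = next(lazy)     # evaluates only the first element
after = list(calls)    # exactly one call has been recorded

print(add4(1), before, first, after)
```

<p>Nothing runs inside the generator until <code>next</code> asks for a value, which is the same idea Spark leans on when it holds off on computing a result until you ask for one.</p>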

<h2>Before we get too far</h2>

<p>You&rsquo;re going to want to get Scala or Python installed so we can get at their REPLs. When you have them installed, drop into your terminal shell and type in either <code>scala</code> or <code>python</code> to drop into a REPL. Give it a swing real quick:</p>

<figure class='code'><figcaption><span>scala-repl</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="nc">Welcome</span> <span class="n">to</span> <span class="nc">Scala</span> <span class="n">version</span> <span class="mf">2.11</span><span class="o">.</span><span class="mi">5</span> <span class="o">(</span><span class="nc">Java</span> <span class="nc">HotSpot</span><span class="o">(</span><span class="nc">TM</span><span class="o">)</span> <span class="mi">64</span><span class="o">-</span><span class="nc">Bit</span> <span class="nc">Server</span> <span class="nc">VM</span><span class="o">,</span> <span class="nc">Java</span> <span class="mf">1.8</span><span class="o">.</span><span class="mi">0</span><span class="n">_31</span><span class="o">).</span>
</span><span class='line'><span class="nc">Type</span> <span class="n">in</span> <span class="n">expressions</span> <span class="n">to</span> <span class="n">have</span> <span class="n">them</span> <span class="n">evaluated</span><span class="o">.</span>
</span><span class='line'><span class="nc">Type</span> <span class="k">:</span><span class="kt">help</span> <span class="kt">for</span> <span class="kt">more</span> <span class="kt">information.</span>
</span><span class='line'>
</span><span class='line'><span class="kt">scala&gt;</span> <span class="err">5</span> <span class="kt">+</span> <span class="err">5</span>
</span><span class='line'><span class="kt">res0:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">10</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-repl</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="n">Python</span> <span class="mf">2.7</span><span class="o">.</span><span class="mi">5</span> <span class="p">(</span><span class="n">default</span><span class="p">,</span> <span class="n">Mar</span>  <span class="mi">9</span> <span class="mi">2014</span><span class="p">,</span> <span class="mi">22</span><span class="p">:</span><span class="mi">15</span><span class="p">:</span><span class="mo">05</span><span class="p">)</span>
</span><span class='line'><span class="p">[</span><span class="n">GCC</span> <span class="mf">4.2</span><span class="o">.</span><span class="mi">1</span> <span class="n">Compatible</span> <span class="n">Apple</span> <span class="n">LLVM</span> <span class="mf">5.0</span> <span class="p">(</span><span class="n">clang</span><span class="o">-</span><span class="mf">500.0</span><span class="o">.</span><span class="mi">68</span><span class="p">)]</span> <span class="n">on</span> <span class="n">darwin</span>
</span><span class='line'><span class="n">Type</span> <span class="s">&quot;help&quot;</span><span class="p">,</span> <span class="s">&quot;copyright&quot;</span><span class="p">,</span> <span class="s">&quot;credits&quot;</span> <span class="ow">or</span> <span class="s">&quot;license&quot;</span> <span class="k">for</span> <span class="n">more</span> <span class="n">information</span><span class="o">.</span>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="mi">5</span> <span class="o">+</span> <span class="mi">5</span>
</span><span class='line'><span class="mi">10</span>
</span></code></pre></td></tr></table></div></figure>


<p>In terms of Functional Programming, and later Spark, the REPL will quickly become your best friend.</p>

<h2>A Function</h2>

<p>So what&rsquo;s a function? Let&rsquo;s give the REPL a whirl:</p>

<figure class='code'><figcaption><span>scala-basic-function</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">def</span> <span class="n">add2</span><span class="o">(</span><span class="n">x</span><span class="k">:</span><span class="kt">Int</span><span class="o">)</span> <span class="k">=</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">2</span>
</span><span class='line'><span class="n">add2</span><span class="k">:</span> <span class="o">(</span><span class="kt">x:</span> <span class="kt">Int</span><span class="o">)</span><span class="kt">Int</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">add2</span><span class="o">(</span><span class="mi">3</span><span class="o">)</span>
</span><span class='line'><span class="n">res1</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">5</span>
</span></code></pre></td></tr></table></div></figure>


<p>Fair warning that Python wants you to put a blank line before it assumes you&rsquo;re done typing:</p>

<figure class='code'><figcaption><span>python-basic-function</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="k">def</span> <span class="nf">add2</span><span class="p">(</span><span class="n">x</span><span class="p">):</span> <span class="k">return</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">2</span>
</span><span class='line'><span class="o">...</span>
</span><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">add2</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
</span><span class='line'><span class="mi">5</span>
</span></code></pre></td></tr></table></div></figure>


<p>Notice something interesting about Scala there? There&rsquo;s no need for a <code>return</code>. The last expression in a function body is implicitly its return value. You&rsquo;ll also notice that the Scala REPL inferred that we&rsquo;re returning an <code>Int</code>, as shown in the method signature it printed.</p>
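
<p>For comparison, Python can write a signature down too, but it treats it very differently. The sketch below assumes Python 3.5 or newer (not the 2.7 shown in the REPL banner earlier) and uses the standard <code>typing</code> module; the annotations are hints for tools and are not checked when the function runs.</p>

```python
from typing import get_type_hints

# Python 3 lets us write the signature down, much like the Scala version.
def add2(x: int) -> int:
    return x + 2

# The annotations are stored on the function and can be inspected.
hints = get_type_hints(add2)
print(hints)

# They are documentation for tools, not a runtime guarantee:
# a float slips straight through.
print(add2(1.5))
```

<p>Scala would refuse to compile the equivalent call with a <code>Double</code>, which is the difference between a hint and a contract.</p>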

<p>Now what&rsquo;s in a method signature? It&rsquo;s a contract, a guarantee of a return type. This brings us to our next concept.</p>

<h2>Goodbye to Nil</h2>

<p>Now remember when I said that Scala was more statically typed than Java? Try to give <code>add2</code> a <code>nil</code> and see what happens in both languages:</p>

<figure class='code'><figcaption><span>scala-none-function</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">add2</span><span class="o">(</span><span class="nc">None</span><span class="o">)</span>
</span><span class='line'><span class="o">&lt;</span><span class="n">console</span><span class="k">&gt;:</span><span class="mi">9</span><span class="k">:</span> <span class="kt">error:</span> <span class="k">type</span> <span class="kt">mismatch</span><span class="o">;</span>
</span><span class='line'> <span class="n">found</span>   <span class="k">:</span> <span class="kt">None.type</span>
</span><span class='line'> <span class="n">required</span><span class="k">:</span> <span class="kt">Int</span>
</span><span class='line'>              <span class="n">add2</span><span class="o">(</span><span class="nc">None</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-none-function</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="n">add2</span><span class="p">(</span><span class="bp">None</span><span class="p">)</span>
</span><span class='line'><span class="n">Traceback</span> <span class="p">(</span><span class="n">most</span> <span class="n">recent</span> <span class="n">call</span> <span class="n">last</span><span class="p">):</span>
</span><span class='line'>  <span class="n">File</span> <span class="s">&quot;&lt;stdin&gt;&quot;</span><span class="p">,</span> <span class="n">line</span> <span class="mi">1</span><span class="p">,</span> <span class="ow">in</span> <span class="o">&lt;</span><span class="n">module</span><span class="o">&gt;</span>
</span><span class='line'>  <span class="n">File</span> <span class="s">&quot;&lt;stdin&gt;&quot;</span><span class="p">,</span> <span class="n">line</span> <span class="mi">1</span><span class="p">,</span> <span class="ow">in</span> <span class="n">add2</span>
</span><span class='line'><span class="ne">TypeError</span><span class="p">:</span> <span class="n">unsupported</span> <span class="n">operand</span> <span class="nb">type</span><span class="p">(</span><span class="n">s</span><span class="p">)</span> <span class="k">for</span> <span class="o">+</span><span class="p">:</span> <span class="s">&#39;NoneType&#39;</span> <span class="ow">and</span> <span class="s">&#39;int&#39;</span>
</span></code></pre></td></tr></table></div></figure>


<p>But that&rsquo;s not <code>nil</code>! That&rsquo;s something called <code>None</code>!</p>

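<p>So what&rsquo;s the difference? Put briefly, neither language uses a value called <code>nil</code>. Python&rsquo;s <code>None</code> plays much the same role and only fails at runtime, as the traceback shows. Scala&rsquo;s <code>None</code> is an ordinary typed value that the compiler checks (Scala does keep <code>null</code> around for Java interop, but idiomatic code avoids it), which is a very good thing for us.</p>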

<p>Unless we explicitly tell Scala that a function can take a <code>None</code>, it will always reject one with a type error. So how do we define something that might or might not have a value? That&rsquo;s what we have <code>Option</code> for:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">def</span> <span class="n">add2Maybe</span><span class="o">(</span><span class="n">x</span><span class="k">:</span><span class="kt">Option</span><span class="o">[</span><span class="kt">Int</span><span class="o">])</span> <span class="k">=</span> <span class="n">x</span><span class="o">.</span><span class="n">getOrElse</span><span class="o">(</span><span class="mi">0</span><span class="o">)</span> <span class="o">+</span> <span class="mi">2</span>
</span><span class='line'><span class="n">add2Maybe</span><span class="k">:</span> <span class="o">(</span><span class="kt">x:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">Int</span><span class="o">])</span><span class="nc">Int</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">add2Maybe</span><span class="o">(</span><span class="nc">Some</span><span class="o">(</span><span class="mi">2</span><span class="o">))</span>
</span><span class='line'><span class="n">res6</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">4</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">add2Maybe</span><span class="o">(</span><span class="nc">None</span><span class="o">)</span>
</span><span class='line'><span class="n">res7</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">2</span>
</span></code></pre></td></tr></table></div></figure>


<p>Python does not have this concept; it only replaces <code>nil</code> with <code>None</code>.</p>
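<p>The closest you get in Python is checking for <code>None</code> by hand. Here&rsquo;s a minimal sketch of that idea, where <code>add2_maybe</code> is a hypothetical name mirroring the Scala function above. Note that the <code>Optional</code> type hint is only documentation as far as the interpreter is concerned; nothing enforces it at runtime:</p>

```python
from typing import Optional


def add2_maybe(x: Optional[int] = None) -> int:
    # Fall back to 0 when there is no value, much like getOrElse(0) in Scala
    value = 0 if x is None else x
    return value + 2


with_value = add2_maybe(2)
without_value = add2_maybe(None)
```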

<h2>Higher Order Functions and Map</h2>

<p>One of the most powerful concepts in Functional Programming is the ability to pass functions as arguments. By doing this we&rsquo;re afforded a great deal of flexibility in defining abstract interfaces for basic operations.</p>

<p><strong>WARNING</strong> Python users: normally you&rsquo;re going to want to use list comprehensions for this type of thing. Since you&rsquo;re going to be applying this to Spark, though, it&rsquo;s necessary to know these types of functions.</p>

<p>Take <code>map</code> for instance, a function that applies a function to each element of a list and returns a new list of the results. This will be confusing for first timers in this territory, so stick with me for a bit here:</p>

<figure class='code'><figcaption><span>scala-map</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">).</span><span class="n">map</span> <span class="o">{</span> <span class="n">i</span> <span class="k">=&gt;</span> <span class="n">i</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">}</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-map</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="nb">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span> <span class="o">*</span> <span class="mi">2</span><span class="p">,</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">])</span>
</span></code></pre></td></tr></table></div></figure>


<p>Now what do you suppose those two do? We&rsquo;re passing in a function that takes one argument (<code>i</code> in Scala, <code>x</code> in Python) and returns that argument times two. We&rsquo;re applying that function to each element in the list, so let&rsquo;s step through this in Scala:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="c1">// val is short for value, or an immutable variable</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">val</span> <span class="n">myList</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">)</span>
</span><span class='line'><span class="n">myList</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span> <span class="mi">2</span><span class="o">,</span> <span class="mi">3</span><span class="o">,</span> <span class="mi">4</span><span class="o">)</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">myList</span><span class="o">.</span><span class="n">map</span> <span class="o">{</span> <span class="n">i</span> <span class="k">=&gt;</span> <span class="n">i</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">}</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// First iteration:  i is 1, returns 2</span>
</span><span class='line'><span class="c1">// Second iteration: i is 2, returns 4</span>
</span><span class='line'><span class="c1">// Third iteration:  i is 3, returns 6</span>
</span><span class='line'><span class="c1">// Fourth iteration: i is 4, returns 8</span>
</span><span class='line'><span class="c1">// ...and now we have a new list returned:</span>
</span><span class='line'><span class="n">res8</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">2</span><span class="o">,</span> <span class="mi">4</span><span class="o">,</span> <span class="mi">6</span><span class="o">,</span> <span class="mi">8</span><span class="o">)</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// You remember I said immutable? What&#39;s myList right now?</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">myList</span>
</span><span class='line'><span class="n">res9</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span> <span class="mi">2</span><span class="o">,</span> <span class="mi">3</span><span class="o">,</span> <span class="mi">4</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>So not only did we double each element in the list, but our original list is untouched. Given this, we can transform <code>myList</code> however we want without ever changing its value. If we want to keep a result for later, we can always create a new <code>val</code> to save it.</p>
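<p>The Python side behaves much the same way. This is a sketch assuming Python 3, where <code>map</code> hands back a lazy iterator rather than a list, so it gets wrapped in <code>list</code> to see the results. The names <code>my_list</code> and <code>doubled</code> are just for illustration. Keep in mind that a Python list is still mutable; it&rsquo;s only that <code>map</code> doesn&rsquo;t touch it:</p>

```python
my_list = [1, 2, 3, 4]

# map returns a lazy iterator in Python 3, so materialize it with list()
doubled = list(map(lambda x: x * 2, my_list))

# The list comprehension most Python users would reach for instead
doubled_comprehension = [x * 2 for x in my_list]
```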

<p>As an aside, Scala is very good about trying to simplify things when it can:</p>

<figure class='code'><figcaption><span>scala-short-map</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">)</span> <span class="n">map</span> <span class="o">(</span><span class="k">_</span> <span class="o">*</span> <span class="mi">2</span><span class="o">)</span>
</span><span class='line'><span class="n">res10</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">2</span><span class="o">,</span> <span class="mi">4</span><span class="o">,</span> <span class="mi">6</span><span class="o">,</span> <span class="mi">8</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>We don&rsquo;t really need the dot there, and whenever a function only takes one parameter Scala will be more than happy to take an underscore in its place to shorten things up for us. While this may seem obscure to some, it&rsquo;s a very common pattern in Scala. It&rsquo;s best to understand what it&rsquo;s doing, because no amount of rudimentary googling is going to turn that up without some fidgeting; such is the nature of searching for operators and syntactic sugar.</p>

<h2>Higher Order Functions - Filter</h2>

<p>The next function on our list is <code>filter</code>, which takes a function, applies it to each element of a list, and keeps only the elements where the result is <code>true</code>:</p>

<figure class='code'><figcaption><span>scala-filter</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">)</span> <span class="n">filter</span> <span class="o">(</span><span class="k">_</span> <span class="o">&gt;</span> <span class="mi">2</span><span class="o">)</span>
</span><span class='line'><span class="n">res10</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">3</span><span class="o">,</span> <span class="mi">4</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-filter</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="nb">filter</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span> <span class="o">&gt;</span> <span class="mi">2</span><span class="p">,</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">])</span>
</span><span class='line'><span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]</span>
</span></code></pre></td></tr></table></div></figure>
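<p>As with <code>map</code>, a Python 3 interpreter gives you a lazy iterator back from <code>filter</code> instead of a list. A small sketch, assuming Python 3 and using illustrative variable names, alongside the list comprehension equivalent:</p>

```python
numbers = [1, 2, 3, 4]

# filter is lazy in Python 3, so wrap it in list() to see the elements
greater_than_two = list(filter(lambda x: x > 2, numbers))

# The same thing as a list comprehension
greater_than_two_comprehension = [x for x in numbers if x > 2]
```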


<h2>Higher Order Functions - Reduce</h2>

<p>This one is going to be a bit trickier, as it takes a function with two arguments: an accumulator and a value. It reduces a list of elements into one element. Now why would you want such a function? Think of something such as a sum. I&rsquo;ll be using longhand here as this is one of the harder first functions to really understand:</p>

<figure class='code'><figcaption><span>scala-reduce</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">).</span><span class="n">reduce</span> <span class="o">{</span> <span class="o">(</span><span class="n">accumulator</span><span class="o">,</span> <span class="n">i</span><span class="o">)</span> <span class="k">=&gt;</span> <span class="n">accumulator</span> <span class="o">+</span> <span class="n">i</span> <span class="o">}</span>
</span><span class='line'><span class="n">res11</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">10</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span>python-reduce</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="o">&gt;&gt;&gt;</span> <span class="nb">reduce</span><span class="p">(</span><span class="k">lambda</span> <span class="n">accumulator</span><span class="p">,</span> <span class="n">i</span><span class="p">:</span> <span class="n">accumulator</span> <span class="o">+</span> <span class="n">i</span><span class="p">,</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">])</span>
</span><span class='line'><span class="mi">10</span>
</span></code></pre></td></tr></table></div></figure>


<p>But how did that work? Let&rsquo;s step through the logic here in Scala:</p>

<figure class='code'><figcaption><span>scala-reduce-explained</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">).</span><span class="n">reduce</span> <span class="o">{</span> <span class="o">(</span><span class="n">accumulator</span><span class="o">,</span> <span class="n">i</span><span class="o">)</span> <span class="k">=&gt;</span> <span class="n">accumulator</span> <span class="o">+</span> <span class="n">i</span> <span class="o">}</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// In our first iteration, the accumulator is either set to a default value,</span>
</span><span class='line'><span class="c1">// or the head element of the list is used. In this case it&#39;s 1</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// First iteration - accumulator: 1, i: 2 =&gt; 3</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// This function returns 3, which is passed in as the next value of the</span>
</span><span class='line'><span class="c1">// accumulator:</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Second iteration - accumulator: 3, i: 3 =&gt; 6</span>
</span><span class='line'><span class="c1">// Third iteration  - accumulator: 6, i: 4 =&gt; 10</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Now we&#39;re out of elements, so reduce returns the accumulator as the result:</span>
</span><span class='line'><span class="n">res11</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">10</span>
</span></code></pre></td></tr></table></div></figure>
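<p>If you&rsquo;d rather watch those steps happen than take my word for it, here&rsquo;s a rough Python sketch that records each call. It assumes Python 3, where <code>reduce</code> has to be imported from <code>functools</code>; <code>traced_add</code> and <code>steps</code> are hypothetical names made up for this example:</p>

```python
from functools import reduce

steps = []


def traced_add(accumulator, i):
    # Record what reduce passed in and what we hand back
    result = accumulator + i
    steps.append((accumulator, i, result))
    return result


# No starting value given, so the head of the list seeds the accumulator
total = reduce(traced_add, [1, 2, 3, 4])

# With an explicit starting value, the accumulator begins there instead
total_from_100 = reduce(lambda accumulator, i: accumulator + i, [1, 2, 3, 4], 100)
```

<p>After this runs, <code>steps</code> holds one tuple per iteration in the form (accumulator, i, result), matching the walk-through above.</p>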


<p>Naturally there&rsquo;s a shorthand for this:</p>

<figure class='code'><figcaption><span>scala-reduce-shorthand</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">).</span><span class="n">reduce</span><span class="o">(</span><span class="k">_</span><span class="o">+</span><span class="k">_</span><span class="o">)</span>
</span><span class='line'><span class="n">res12</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">10</span>
</span></code></pre></td></tr></table></div></figure>


<p>An astute reader will notice that we used two underscores here. Scala binds arguments to the underscores in order, so the first underscore is the accumulator and the second is the current element. That makes for a bit more confusion when searching.</p>
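<p>Python has no underscore placeholder, but the standard library&rsquo;s <code>operator</code> module gets you a similar shorthand by naming the operator as a function. A sketch, again assuming Python 3:</p>

```python
from functools import reduce
import operator

# operator.add(a, b) is the function form of a + b
total = reduce(operator.add, [1, 2, 3, 4])

# For plain sums the built-in does the job without reduce at all
total_builtin = sum([1, 2, 3, 4])
```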

<h2>Higher Order Functions - Closures</h2>

<p>One of the really nifty things about functions in languages like Scala is that they capture their local environment when they&rsquo;re defined. What do I mean by that? Let&rsquo;s take a look:</p>

<figure class='code'><figcaption><span>scala-closure</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">def</span> <span class="n">adder</span><span class="o">(</span><span class="n">x</span><span class="k">:</span><span class="kt">Int</span><span class="o">)</span> <span class="k">=</span> <span class="o">(</span><span class="n">y</span><span class="k">:</span><span class="kt">Int</span><span class="o">)</span> <span class="k">=&gt;</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
</span><span class='line'><span class="n">adder</span><span class="k">:</span> <span class="o">(</span><span class="kt">x:</span> <span class="kt">Int</span><span class="o">)</span><span class="kt">Int</span> <span class="o">=&gt;</span> <span class="nc">Int</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">val</span> <span class="n">add3</span> <span class="k">=</span> <span class="n">adder</span><span class="o">(</span><span class="mi">3</span><span class="o">)</span>
</span><span class='line'><span class="n">add3</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=&gt;</span> <span class="nc">Int</span> <span class="k">=</span> <span class="o">&lt;</span><span class="n">function1</span><span class="o">&gt;</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">add3</span><span class="o">(</span><span class="mi">5</span><span class="o">)</span>
</span><span class='line'><span class="n">res14</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">8</span>
</span></code></pre></td></tr></table></div></figure>


<p>So where did it get 3 from? It remembered the value, or in functional terms it closed over the value when it was defined.</p>
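<p>Python functions close over values in the same way, and you can even peek at what was remembered. This is a sketch rather than anything you&rsquo;d write day to day: the names mirror the Scala example, and poking at <code>__closure__</code> is only here to show where the 3 lives:</p>

```python
def adder(x):
    # The returned lambda closes over x and keeps it after adder returns
    return lambda y: x + y


add3 = adder(3)
result = add3(5)

# Each captured variable is stored in a cell on the function object
remembered = add3.__closure__[0].cell_contents
```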

<p>How is this handy? Let&rsquo;s use map and pass it our function:</p>

<figure class='code'><figcaption><span>scala-closure-map</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">).</span><span class="n">map</span><span class="o">(</span><span class="n">add3</span><span class="o">)</span>
</span><span class='line'><span class="n">res15</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">4</span><span class="o">,</span> <span class="mi">5</span><span class="o">,</span> <span class="mi">6</span><span class="o">,</span> <span class="mi">7</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Let&rsquo;s take it one step further though and just use adder. After all, we might need a bit more flexibility there:</p>

<figure class='code'><figcaption><span>scala-closure-map-dynamic</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">).</span><span class="n">map</span><span class="o">(</span><span class="n">adder</span><span class="o">(</span><span class="mi">5</span><span class="o">))</span>
</span><span class='line'><span class="n">res17</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">6</span><span class="o">,</span> <span class="mi">7</span><span class="o">,</span> <span class="mi">8</span><span class="o">,</span> <span class="mi">9</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>So now we have a function which returns a function that gets used by map. The same trick works with filter:</p>

<figure class='code'><figcaption><span>scala-closure-filter</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">def</span> <span class="n">divisibleBy</span><span class="o">(</span><span class="n">y</span><span class="k">:</span><span class="kt">Int</span><span class="o">)</span> <span class="k">=</span> <span class="o">(</span><span class="n">x</span><span class="k">:</span><span class="kt">Int</span><span class="o">)</span> <span class="k">=&gt;</span> <span class="n">x</span> <span class="o">%</span> <span class="n">y</span> <span class="o">==</span> <span class="mi">0</span>
</span><span class='line'><span class="n">divisibleBy</span><span class="k">:</span> <span class="o">(</span><span class="kt">y:</span> <span class="kt">Int</span><span class="o">)</span><span class="kt">Int</span> <span class="o">=&gt;</span> <span class="nc">Boolean</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">).</span><span class="n">filter</span><span class="o">(</span><span class="n">divisibleBy</span><span class="o">(</span><span class="mi">2</span><span class="o">))</span>
</span><span class='line'><span class="n">res18</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">2</span><span class="o">,</span> <span class="mi">4</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>
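<p>For the Python crowd, the same function-returning-a-function pattern might look like the sketch below. It assumes Python 3 (hence the <code>list</code> wrapper), and <code>divisible_by</code> is simply the Scala name in Python clothing:</p>

```python
def divisible_by(y):
    # Returns a predicate that remembers y
    return lambda x: x % y == 0


evens = list(filter(divisible_by(2), [1, 2, 3, 4]))
threes = list(filter(divisible_by(3), range(1, 10)))
```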


<h2>Laziness can be good</h2>

<p>You might wonder when a concept like a lazy value might be handy. Let&rsquo;s say you need an infinite list, or stream, and you have no clue how many elements you&rsquo;ll need or get from it.</p>
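<p>In Python terms, a generator is the everyday version of this idea: nothing is computed until somebody asks for the next element. A minimal sketch, where <code>naturals</code> is a hypothetical generator written for this example:</p>

```python
from itertools import islice


def naturals():
    # An endless sequence; each value is produced only when requested
    n = 0
    while True:
        yield n
        n += 1


# islice takes just the first five values and then stops asking
first_five = list(islice(naturals(), 5))
```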

<p>How about an infinite stream of Fibonacci numbers, straight from the Scala source code:</p>

<figure class='code'><figcaption><span>scala-lazy</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">val</span> <span class="n">fibs</span><span class="k">:</span> <span class="kt">Stream</span><span class="o">[</span><span class="kt">BigInt</span><span class="o">]</span> <span class="k">=</span>
</span><span class='line'>  <span class="nc">BigInt</span><span class="o">(</span><span class="mi">0</span><span class="o">)</span> <span class="o">#::</span> <span class="nc">BigInt</span><span class="o">(</span><span class="mi">1</span><span class="o">)</span> <span class="o">#::</span> <span class="n">fibs</span><span class="o">.</span><span class="n">zip</span><span class="o">(</span><span class="n">fibs</span><span class="o">.</span><span class="n">tail</span><span class="o">).</span><span class="n">map</span> <span class="o">{</span> <span class="n">n</span> <span class="k">=&gt;</span> <span class="n">n</span><span class="o">.</span><span class="n">_1</span> <span class="o">+</span> <span class="n">n</span><span class="o">.</span><span class="n">_2</span> <span class="o">}</span>
</span><span class='line'><span class="n">fibs</span><span class="k">:</span> <span class="kt">Stream</span><span class="o">[</span><span class="kt">BigInt</span><span class="o">]</span> <span class="k">=</span> <span class="nc">Stream</span><span class="o">(</span><span class="mi">0</span><span class="o">,</span> <span class="o">?)</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">fibs</span><span class="o">.</span><span class="n">take</span><span class="o">(</span><span class="mi">10</span><span class="o">).</span><span class="n">foreach</span><span class="o">(</span><span class="n">println</span><span class="o">)</span>
</span><span class='line'><span class="mi">0</span>
</span><span class='line'><span class="mi">1</span>
</span><span class='line'><span class="mi">1</span>
</span><span class='line'><span class="mi">2</span>
</span><span class='line'><span class="mi">3</span>
</span><span class='line'><span class="mi">5</span>
</span><span class='line'><span class="mi">8</span>
</span><span class='line'><span class="mi">13</span>
</span><span class='line'><span class="mi">21</span>
</span><span class='line'><span class="mi">34</span>
</span></code></pre></td></tr></table></div></figure>


<p>That&rsquo;s a lot of code to digest, but it gives us an infinite stream of Fibonacci numbers. Let&rsquo;s break it apart a bit:</p>

<figure class='code'><figcaption><span>scala-lazy-explained</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="c1">// We&#39;re creating a new value that&#39;s a stream of BigInts</span>
</span><span class='line'><span class="k">val</span> <span class="n">fibs</span><span class="k">:</span> <span class="kt">Stream</span><span class="o">[</span><span class="kt">BigInt</span><span class="o">]</span> <span class="k">=</span>
</span><span class='line'>  <span class="c1">// Where the first value is zero, lazily concatenated with</span>
</span><span class='line'>  <span class="nc">BigInt</span><span class="o">(</span><span class="mi">0</span><span class="o">)</span> <span class="o">#::</span>
</span><span class='line'>  <span class="c1">// The second value, which is one, lazily concatenated with</span>
</span><span class='line'>  <span class="nc">BigInt</span><span class="o">(</span><span class="mi">1</span><span class="o">)</span> <span class="o">#::</span>
</span><span class='line'>  <span class="c1">// The rest of the stream: take the Fibonacci numbers so far,</span>
</span><span class='line'>  <span class="c1">// zip them with their tail, and add each pair together</span>
</span><span class='line'>  <span class="n">fibs</span><span class="o">.</span><span class="n">zip</span><span class="o">(</span><span class="n">fibs</span><span class="o">.</span><span class="n">tail</span><span class="o">).</span><span class="n">map</span> <span class="o">{</span> <span class="n">n</span> <span class="k">=&gt;</span> <span class="n">n</span><span class="o">.</span><span class="n">_1</span> <span class="o">+</span> <span class="n">n</span><span class="o">.</span><span class="n">_2</span> <span class="o">}</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// What are zip and tail?</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="k">val</span> <span class="n">zipList</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">)</span>
</span><span class='line'><span class="n">zipList</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span> <span class="mi">2</span><span class="o">,</span> <span class="mi">3</span><span class="o">,</span> <span class="mi">4</span><span class="o">)</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Remember head? It gets the first element of our list</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">zipList</span><span class="o">.</span><span class="n">head</span>
</span><span class='line'><span class="n">res22</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">1</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Tail just gets the rest</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">zipList</span><span class="o">.</span><span class="n">tail</span>
</span><span class='line'><span class="n">res23</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">2</span><span class="o">,</span> <span class="mi">3</span><span class="o">,</span> <span class="mi">4</span><span class="o">)</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Zip takes two lists and zips them together into tuple pairs</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="n">zipList</span><span class="o">.</span><span class="n">zip</span><span class="o">(</span><span class="n">zipList</span><span class="o">.</span><span class="n">tail</span><span class="o">)</span>
</span><span class='line'><span class="n">res24</span><span class="k">:</span> <span class="kt">List</span><span class="o">[(</span><span class="kt">Int</span>, <span class="kt">Int</span><span class="o">)]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">((</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">),</span> <span class="o">(</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">),</span> <span class="o">(</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">))</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Now about that map function:</span>
</span><span class='line'><span class="c1">// map { n =&gt; n._1 + n._2 }</span>
</span><span class='line'><span class="c1">//</span>
</span><span class='line'><span class="c1">// That just adds the two elements of the tuple together. In Scala, _n is the</span>
</span><span class='line'><span class="c1">// nth element of a tuple, and it is 1-indexed (so _1 is the first element)</span>
</span></code></pre></td></tr></table></div></figure>


<h2>How is this relevant?</h2>

<p>Now that we have all these components, let&rsquo;s take another look at that wordcount example for Spark:</p>

<figure class='code'><figcaption><span>spark-wordcount-example</span><a href='https://spark.apache.org/examples.html'>link</a></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="k">val</span> <span class="n">textFile</span> <span class="k">=</span> <span class="n">spark</span><span class="o">.</span><span class="n">textFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">)</span>
</span><span class='line'><span class="k">val</span> <span class="n">counts</span> <span class="k">=</span> <span class="n">textFile</span><span class="o">.</span><span class="n">flatMap</span><span class="o">(</span><span class="n">line</span> <span class="k">=&gt;</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="o">(</span><span class="s">&quot; &quot;</span><span class="o">))</span>
</span><span class='line'>                 <span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">word</span> <span class="k">=&gt;</span> <span class="o">(</span><span class="n">word</span><span class="o">,</span> <span class="mi">1</span><span class="o">))</span>
</span><span class='line'>                 <span class="o">.</span><span class="n">reduceByKey</span><span class="o">(</span><span class="k">_</span> <span class="o">+</span> <span class="k">_</span><span class="o">)</span>
</span><span class='line'><span class="n">counts</span><span class="o">.</span><span class="n">saveAsTextFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Some of those look familiar? Let&rsquo;s dissect it a bit:</p>

<figure class='code'><figcaption><span>spark-wordcount-example-explained</span><a href='https://spark.apache.org/examples.html'>link</a></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="c1">// We&#39;re reading our document from HDFS, and storing it in textFile</span>
</span><span class='line'><span class="k">val</span> <span class="n">textFile</span> <span class="k">=</span> <span class="n">spark</span><span class="o">.</span><span class="n">textFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">)</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Now we&#39;re defining our pipeline - By the way, this is lazy</span>
</span><span class='line'><span class="k">val</span> <span class="n">counts</span> <span class="k">=</span>
</span><span class='line'>  <span class="n">textFile</span>
</span><span class='line'>    <span class="c1">// Flat map is very similar to map, except it flattens the results after it</span>
</span><span class='line'>    <span class="c1">// gets them (see flatmap below)</span>
</span><span class='line'>    <span class="c1">//</span>
</span><span class='line'>    <span class="c1">// What we&#39;re doing here is splitting each line by whitespace, and then</span>
</span><span class='line'>    <span class="c1">// flattening into one stream of words to go through</span>
</span><span class='line'>    <span class="o">.</span><span class="n">flatMap</span><span class="o">(</span><span class="n">line</span> <span class="k">=&gt;</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="o">(</span><span class="s">&quot; &quot;</span><span class="o">))</span>
</span><span class='line'>    <span class="c1">// Then we&#39;re mapping all those words into a tuple, we&#39;ll see why in a</span>
</span><span class='line'>    <span class="c1">// second</span>
</span><span class='line'>    <span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">word</span> <span class="k">=&gt;</span> <span class="o">(</span><span class="n">word</span><span class="o">,</span> <span class="mi">1</span><span class="o">))</span>
</span><span class='line'>    <span class="c1">// Reduce by key takes all similar keys and reduces the values with a</span>
</span><span class='line'>    <span class="c1">// function, in this case a sum</span>
</span><span class='line'>    <span class="o">.</span><span class="n">reduceByKey</span><span class="o">(</span><span class="k">_</span> <span class="o">+</span> <span class="k">_</span><span class="o">)</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Why the tuple and reduce by key? Normally you&#39;d use a groupBy operator here,</span>
</span><span class='line'><span class="c1">// but that does not parallelize cleanly.</span>
</span><span class='line'><span class="c1">//</span>
</span><span class='line'><span class="c1">// What we do to compensate here is make tuples so that we can send specific</span>
</span><span class='line'><span class="c1">// words to different partitions to be reduced</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// NOW the pipeline gets called, as we want a value out of it. In this case it</span>
</span><span class='line'><span class="c1">// saves a new text file</span>
</span><span class='line'><span class="n">counts</span><span class="o">.</span><span class="n">saveAsTextFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">)</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// Flat Map</span>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">),</span> <span class="nc">List</span><span class="o">(</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">)).</span><span class="n">map</span><span class="o">(</span><span class="n">list</span> <span class="k">=&gt;</span> <span class="n">list</span><span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">adder</span><span class="o">(</span><span class="mi">2</span><span class="o">)))</span>
</span><span class='line'><span class="n">res27</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="nc">List</span><span class="o">(</span><span class="mi">3</span><span class="o">,</span> <span class="mi">4</span><span class="o">,</span> <span class="mi">5</span><span class="o">),</span> <span class="nc">List</span><span class="o">(</span><span class="mi">4</span><span class="o">,</span> <span class="mi">5</span><span class="o">,</span> <span class="mi">6</span><span class="o">))</span>
</span><span class='line'>
</span><span class='line'><span class="n">scala</span><span class="o">&gt;</span> <span class="nc">List</span><span class="o">(</span><span class="nc">List</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">),</span> <span class="nc">List</span><span class="o">(</span><span class="mi">2</span><span class="o">,</span><span class="mi">3</span><span class="o">,</span><span class="mi">4</span><span class="o">)).</span><span class="n">flatMap</span><span class="o">(</span><span class="n">list</span> <span class="k">=&gt;</span> <span class="n">list</span><span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">adder</span><span class="o">(</span><span class="mi">2</span><span class="o">)))</span>
</span><span class='line'><span class="n">res28</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">3</span><span class="o">,</span> <span class="mi">4</span><span class="o">,</span> <span class="mi">5</span><span class="o">,</span> <span class="mi">4</span><span class="o">,</span> <span class="mi">5</span><span class="o">,</span> <span class="mi">6</span><span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>There&rsquo;s a lot more to Spark than this, but now you&rsquo;ve got a grounding on which you can build. Next we&rsquo;ll be looking more into Spark specifically.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Sublime Scoping with Rails]]></title>
    <link href="http://www.baweaver.com/blog/2015/05/04/sublime-scoping-with-rails/"/>
    <updated>2015-05-04T22:21:12-07:00</updated>
    <id>http://www.baweaver.com/blog/2015/05/04/sublime-scoping-with-rails</id>
    <content type="html"><![CDATA[<p>Even the most ardent adherent of skinny controllers will find themselves plagued by the ferocious number of filters demanded for any non-trivial search on their models. Given enough attributes, you&rsquo;ll notice your controller starting to look a little hairy:</p>

<!-- more -->




<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">PeopleController</span>
</span><span class='line'>  <span class="k">def</span> <span class="nf">index</span>
</span><span class='line'>    <span class="vi">@people</span> <span class="o">=</span> <span class="n">params</span><span class="o">[</span><span class="ss">:name</span><span class="o">]</span> <span class="p">?</span> <span class="no">Person</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="nb">name</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:name</span><span class="o">]</span><span class="p">)</span> <span class="p">:</span> <span class="no">Person</span><span class="o">.</span><span class="n">all</span>
</span><span class='line'>    <span class="vi">@people</span> <span class="o">=</span> <span class="vi">@people</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">birthday</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:birthday_start</span><span class="o">].</span><span class="n">.params</span><span class="o">[</span><span class="ss">:birthday_end</span><span class="o">]</span><span class="p">)</span> <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:birthday_start</span><span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="n">params</span><span class="o">[</span><span class="ss">:birthday_end</span><span class="o">]</span>
</span><span class='line'>    <span class="vi">@people</span> <span class="o">=</span> <span class="vi">@people</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">sex</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:sex</span><span class="o">]</span><span class="p">)</span> <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:sex</span><span class="o">]</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>The horrifying trend will only continue as our demand for searching power grows, which raises the question: how can we tame this mess?</p>

<h2>Strong Params</h2>

<p>Your first line of defense against this will be using strong params to your advantage. They&rsquo;re not only for creating objects.</p>

<p>Let&rsquo;s try something out in the console:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="o">[</span><span class="mi">1</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="no">ActionController</span><span class="o">::</span><span class="no">Parameters</span><span class="o">.</span><span class="n">new</span><span class="p">({</span><span class="ss">a</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="ss">b</span><span class="p">:</span> <span class="mi">2</span><span class="p">})</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="p">{</span><span class="s2">&quot;a&quot;</span><span class="o">=&gt;</span><span class="mi">1</span><span class="p">,</span> <span class="s2">&quot;b&quot;</span><span class="o">=&gt;</span><span class="mi">2</span><span class="p">}</span>
</span><span class='line'><span class="o">[</span><span class="mi">2</span><span class="o">]</span> <span class="n">pry</span><span class="p">(</span><span class="n">main</span><span class="p">)</span><span class="o">&gt;</span> <span class="n">_</span><span class="o">.</span><span class="n">permit</span><span class="p">(</span><span class="ss">:a</span><span class="p">)</span>
</span><span class='line'><span class="no">Unpermitted</span> <span class="ss">parameter</span><span class="p">:</span> <span class="n">b</span>
</span><span class='line'><span class="o">=&gt;</span> <span class="p">{</span><span class="s2">&quot;a&quot;</span><span class="o">=&gt;</span><span class="mi">1</span><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>By using <code>permit</code> on our parameters object, we can filter a hash down to only our permitted values. What if we did something like this?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">PeopleController</span>
</span><span class='line'>  <span class="k">def</span> <span class="nf">index</span>
</span><span class='line'>    <span class="vi">@people</span> <span class="o">=</span> <span class="no">Person</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">params</span><span class="o">.</span><span class="n">permit</span><span class="p">(</span><span class="ss">:name</span><span class="p">,</span> <span class="ss">:sex</span><span class="p">))</span>
</span><span class='line'>    <span class="vi">@people</span> <span class="o">=</span> <span class="vi">@people</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">birthday</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:birthday_start</span><span class="o">].</span><span class="n">.params</span><span class="o">[</span><span class="ss">:birthday_end</span><span class="o">]</span><span class="p">)</span> <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:birthday_start</span><span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="n">params</span><span class="o">[</span><span class="ss">:birthday_end</span><span class="o">]</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>With that we&rsquo;ve already cleaned out a lot of the cruft of our controller, but what about that last one?</p>

<h2>Scoping and Class Methods</h2>

<p>We can get rid of it as well, using either scoping or class methods to take care of it for us:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">Person</span>
</span><span class='line'>  <span class="c1"># We can go with a scope:</span>
</span><span class='line'>  <span class="n">scope</span> <span class="ss">:born_between</span><span class="p">,</span> <span class="o">-&gt;</span> <span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span> <span class="p">{</span> <span class="n">where</span><span class="p">(</span><span class="ss">birthday</span><span class="p">:</span> <span class="n">start_date</span><span class="o">.</span><span class="n">.end_date</span><span class="p">)</span> <span class="p">}</span>
</span><span class='line'>
</span><span class='line'>  <span class="c1"># ...or a class method:</span>
</span><span class='line'>  <span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">born_between</span><span class="p">(</span><span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span><span class="p">)</span>
</span><span class='line'>    <span class="n">where</span><span class="p">(</span><span class="ss">birthday</span><span class="p">:</span> <span class="n">start_date</span><span class="o">.</span><span class="n">.end_date</span><span class="p">)</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Which will let us trim down our controller even a little more here:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">PeopleController</span>
</span><span class='line'>  <span class="k">def</span> <span class="nf">index</span>
</span><span class='line'>    <span class="vi">@people</span> <span class="o">=</span> <span class="no">Person</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">params</span><span class="o">.</span><span class="n">permit</span><span class="p">(</span><span class="ss">:name</span><span class="p">,</span> <span class="ss">:sex</span><span class="p">))</span><span class="o">.</span><span class="n">born_between</span><span class="p">(</span><span class="n">params</span><span class="o">[</span><span class="ss">:birthday_start</span><span class="o">]</span><span class="p">,</span> <span class="n">params</span><span class="o">[</span><span class="ss">:birthday_end</span><span class="o">]</span><span class="p">)</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<h2>Conditional Scoping</h2>

<p>The astute reader will note that the above method is going to fail gloriously should we forget either of those params. We could always drop it to another variable and mutate people, but that&rsquo;s generally frowned upon and doesn&rsquo;t normally produce superheroes.</p>

<p>What we can do, however, is introduce a more conditional scoping method. Class methods are, after all, Ruby methods. Let&rsquo;s use them to their potential a bit more:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">Person</span>
</span><span class='line'>  <span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">born_between</span><span class="p">(</span><span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span> <span class="o">=</span> <span class="no">Time</span><span class="o">.</span><span class="n">now</span><span class="p">)</span>
</span><span class='line'>    <span class="n">start_date</span> <span class="p">?</span> <span class="n">where</span><span class="p">(</span><span class="ss">birthday</span><span class="p">:</span> <span class="n">start_date</span><span class="o">.</span><span class="n">.end_date</span><span class="p">)</span> <span class="p">:</span> <span class="n">all</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>By throwing in an <code>all</code>, we can conditionally chain freely.</p>
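
<p>To see why returning <code>all</code> keeps a chain intact, here is a minimal sketch in plain Ruby, outside of Rails. <code>PeopleQuery</code> and <code>with_sex</code> are hypothetical stand-ins invented for illustration, and the Array-backed <code>where</code> only imitates what an ActiveRecord relation does; the point is just that every filter hands back something chainable whether or not its argument was supplied.</p>

```ruby
require 'date'

# Hypothetical stand-in for an ActiveRecord relation, backed by an Array.
class PeopleQuery
  attr_reader :rows

  def initialize(rows)
    @rows = rows
  end

  # Plays the role of `all`: returns the query unchanged.
  def all
    self
  end

  # Plays the role of `where`, taking a block instead of SQL conditions.
  def where
    PeopleQuery.new(rows.select { |row| yield row })
  end

  # Skips the filter entirely when either bound is missing.
  def born_between(start_date, end_date)
    return all unless start_date && end_date
    where { |row| (start_date..end_date).cover?(row[:birthday]) }
  end

  # Skips the filter entirely when no value was given.
  def with_sex(sex)
    sex ? where { |row| row[:sex] == sex } : all
  end
end

people = PeopleQuery.new([
  { name: 'Ada',   sex: 'f', birthday: Date.new(1985, 3, 1) },
  { name: 'Brian', sex: 'm', birthday: Date.new(1992, 7, 9) }
])

# No params given: both filters fall through to `all`.
everyone = people.with_sex(nil).born_between(nil, nil)

# Only the date range given: the first filter is skipped, the second applies.
eighties = people.with_sex(nil).born_between(Date.new(1980, 1, 1), Date.new(1989, 12, 31))
```

<p>Because a skipped filter returns the receiver untouched, the caller never has to check for <code>nil</code> between links in the chain.</p>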

<h2>Like Scoping</h2>

<p>The problem is, that name search just isn&rsquo;t doing it for us. We don&rsquo;t want to break out <code>solr</code> or <code>trigrams</code> quite yet, but we can use some <code>like</code> queries to make it a bit more flexible:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">PeopleController</span>
</span><span class='line'>  <span class="k">def</span> <span class="nf">index</span>
</span><span class='line'>    <span class="vi">@people</span> <span class="o">=</span> <span class="no">Person</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">params</span><span class="o">.</span><span class="n">permit</span><span class="p">(</span><span class="ss">:sex</span><span class="p">))</span><span class="o">.</span><span class="n">born_between</span><span class="p">(</span><span class="n">params</span><span class="o">[</span><span class="ss">:birthday_start</span><span class="o">]</span><span class="p">,</span> <span class="n">params</span><span class="o">[</span><span class="ss">:birthday_end</span><span class="o">]</span><span class="p">)</span>
</span><span class='line'>    <span class="vi">@people</span> <span class="o">=</span> <span class="vi">@people</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="s1">&#39;name LIKE ?&#39;</span><span class="p">,</span> <span class="n">params</span><span class="o">[</span><span class="ss">:name</span><span class="o">]</span><span class="p">)</span> <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:name</span><span class="o">]</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>
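
<p>One caveat with the query above: without wildcards, <code>LIKE</code> matches the whole string, so it behaves much like an equality check. A small helper can wrap the search term in <code>%</code> and escape any wildcard characters the user typed. This is a sketch rather than the article&rsquo;s own code: <code>like_pattern</code> is a hypothetical name, and it assumes backslash is the database&rsquo;s <code>LIKE</code> escape character (the default in PostgreSQL and MySQL; other databases may need an explicit <code>ESCAPE</code> clause).</p>

```ruby
# Hypothetical helper: escape LIKE metacharacters (\, % and _) with a
# backslash, then wrap the term in % so it matches anywhere in the column.
def like_pattern(term)
  escaped = term.gsub(/[\\%_]/) { |char| "\\#{char}" }
  "%#{escaped}%"
end

like_pattern('wea')  # => "%wea%"
like_pattern('100%') # => "%100\\%%"
```

<p>It would slot into the query as <code>where('name LIKE ?', like_pattern(params[:name]))</code>.</p>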


<p>We spent all that time getting rid of postfix <code>if</code> checks, though. Can we do something about this one as well?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">Person</span>
</span><span class='line'>  <span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">where_name_like</span><span class="p">(</span><span class="nb">name</span><span class="p">)</span>
</span><span class='line'>    <span class="nb">name</span> <span class="p">?</span> <span class="n">where</span><span class="p">(</span><span class="s1">&#39;name LIKE ?&#39;</span><span class="p">,</span> <span class="nb">name</span><span class="p">)</span> <span class="p">:</span> <span class="n">all</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'>
</span><span class='line'>  <span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">born_between</span><span class="p">(</span><span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span> <span class="o">=</span> <span class="no">Time</span><span class="o">.</span><span class="n">now</span><span class="p">)</span>
</span><span class='line'>    <span class="n">start_date</span> <span class="p">?</span> <span class="n">where</span><span class="p">(</span><span class="ss">age</span><span class="p">:</span> <span class="n">start_date</span><span class="o">.</span><span class="n">.end_date</span><span class="p">)</span> <span class="p">:</span> <span class="n">all</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>That we can:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">PeopleController</span>
</span><span class='line'>  <span class="k">def</span> <span class="nf">index</span>
</span><span class='line'>    <span class="vi">@people</span> <span class="o">=</span>
</span><span class='line'>      <span class="no">Person</span>
</span><span class='line'>        <span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">params</span><span class="o">.</span><span class="n">permit</span><span class="p">(</span><span class="ss">:sex</span><span class="p">))</span>
</span><span class='line'>        <span class="o">.</span><span class="n">born_between</span><span class="p">(</span><span class="n">params</span><span class="o">[</span><span class="ss">:age_start</span><span class="o">]</span><span class="p">,</span> <span class="n">params</span><span class="o">[</span><span class="ss">:age_end</span><span class="o">]</span><span class="p">)</span>
</span><span class='line'>        <span class="o">.</span><span class="n">where_name_like</span><span class="p">(</span><span class="n">params</span><span class="o">[</span><span class="ss">:name</span><span class="o">]</span><span class="p">)</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Just like that, we&rsquo;ve eliminated another postfix <code>if</code>.</p>
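
<p>The reason this chains cleanly is that each class method always returns something chainable: the filtered relation when the parameter is present, and <code>all</code> otherwise. A rough plain-Ruby sketch of that idea follows. <code>PeopleQuery</code>, <code>older_than</code>, and the array-backed records are hypothetical stand-ins for an ActiveRecord relation, not anything Rails provides:</p>

```ruby
# Hypothetical stand-in for an ActiveRecord relation, backed by an array of hashes.
class PeopleQuery
  attr_reader :records

  def initialize(records)
    @records = records
  end

  # Mirrors `all`: returns the current, unfiltered query.
  def all
    self
  end

  # Applies the filter only when a name was given; otherwise passes the chain through.
  def where_name_like(name)
    name ? PeopleQuery.new(records.select { |r| r[:name].include?(name) }) : all
  end

  # Same guard: no lower bound means no filtering.
  def older_than(age)
    age ? PeopleQuery.new(records.select { |r| r[:age] > age }) : all
  end
end

people = PeopleQuery.new([
  { name: 'Alice', age: 30 },
  { name: 'Bob',   age: 25 }
])

people.where_name_like(nil).older_than(nil).records.size # nothing filtered
people.where_name_like('Ali').older_than(nil).records    # only Alice remains
```

<p>Because a missing parameter hands back the same query untouched, the caller never needs a conditional of its own.</p>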

<h2>More advanced filtering</h2>

<p>Though most of these examples have been fairly straightforward, there will be times when you have to break out some joins and other operations depending on your parameters. Strong params aren&rsquo;t going to cut it on those, but class methods just might do the trick.</p>

<p>We have a new model to work with, <code>Post</code>, and with it the following controller:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">PostsController</span>
</span><span class='line'>  <span class="k">def</span> <span class="nf">index</span>
</span><span class='line'>    <span class="vi">@posts</span> <span class="o">=</span> <span class="no">Post</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">params</span><span class="o">.</span><span class="n">permit</span><span class="p">(</span><span class="ss">:name</span><span class="p">))</span>
</span><span class='line'>    <span class="vi">@posts</span> <span class="o">=</span> <span class="vi">@posts</span><span class="o">.</span><span class="n">joins</span><span class="p">(</span><span class="ss">:users</span><span class="p">)</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">users</span><span class="p">:</span> <span class="p">{</span><span class="nb">id</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:user_id</span><span class="o">]</span><span class="p">})</span> <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:user_id</span><span class="o">]</span>
</span><span class='line'>    <span class="vi">@posts</span> <span class="o">=</span> <span class="vi">@posts</span><span class="o">.</span><span class="n">includes</span><span class="p">(</span><span class="ss">:comments</span><span class="p">)</span> <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:show_comments</span><span class="o">]</span>
</span><span class='line'>    <span class="vi">@posts</span> <span class="o">=</span> <span class="vi">@posts</span><span class="o">.</span><span class="n">includes</span><span class="p">(</span><span class="ss">:tags</span><span class="p">)</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">tags</span><span class="p">:</span> <span class="p">{</span><span class="nb">name</span><span class="p">:</span> <span class="no">JSON</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="n">params</span><span class="o">[</span><span class="ss">:tags</span><span class="o">]</span><span class="p">)})</span> <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:tags</span><span class="o">]</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Some of those earlier techniques just aren&rsquo;t going to cut it, and it&rsquo;s going to be a lot more difficult to be intention revealing here. Including comments and tags when we don&rsquo;t have to could be a big expense, so we need to keep those under conditionals to prevent unnecessary data from being fetched.</p>

<p>We&rsquo;re going to have to use something new here. Let&rsquo;s condense those conditionals into a scope:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">Post</span>
</span><span class='line'>  <span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">by_user</span><span class="p">(</span><span class="n">args</span> <span class="o">=</span> <span class="p">{})</span>
</span><span class='line'>    <span class="n">args</span><span class="o">[</span><span class="ss">:if</span><span class="o">]</span> <span class="p">?</span> <span class="n">joins</span><span class="p">(</span><span class="ss">:users</span><span class="p">)</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">users</span><span class="p">:</span> <span class="p">{</span><span class="nb">id</span><span class="p">:</span> <span class="n">args</span><span class="o">[</span><span class="ss">:if</span><span class="o">]</span><span class="p">})</span> <span class="p">:</span> <span class="n">all</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'>
</span><span class='line'>  <span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">with_comments</span><span class="p">(</span><span class="n">args</span> <span class="o">=</span> <span class="p">{})</span>
</span><span class='line'>    <span class="n">args</span><span class="o">[</span><span class="ss">:if</span><span class="o">]</span> <span class="p">?</span> <span class="n">includes</span><span class="p">(</span><span class="ss">:comments</span><span class="p">)</span> <span class="p">:</span> <span class="n">all</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'>
</span><span class='line'>  <span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">with_tags</span><span class="p">(</span><span class="n">args</span> <span class="o">=</span> <span class="p">{})</span>
</span><span class='line'>    <span class="n">args</span><span class="o">[</span><span class="ss">:if</span><span class="o">]</span> <span class="p">?</span> <span class="n">includes</span><span class="p">(</span><span class="ss">:tags</span><span class="p">)</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">tags</span><span class="p">:</span> <span class="p">{</span><span class="nb">name</span><span class="p">:</span> <span class="no">JSON</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="n">args</span><span class="o">[</span><span class="ss">:if</span><span class="o">]</span><span class="p">)})</span> <span class="p">:</span> <span class="n">all</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Note that keyword arguments would be very unhappy with us naming a parameter <code>if</code>, since it&rsquo;s a reserved word, so an options hash it is.</p>

<p>These class methods allow us to write a much clearer controller:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">class</span> <span class="nc">PostsController</span>
</span><span class='line'>  <span class="k">def</span> <span class="nf">index</span>
</span><span class='line'>    <span class="vi">@posts</span> <span class="o">=</span>
</span><span class='line'>      <span class="no">Post</span>
</span><span class='line'>        <span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">params</span><span class="o">.</span><span class="n">permit</span><span class="p">(</span><span class="ss">:name</span><span class="p">))</span>
</span><span class='line'>        <span class="o">.</span><span class="n">by_user</span><span class="p">(</span><span class="k">if</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:user_id</span><span class="o">]</span><span class="p">)</span>
</span><span class='line'>        <span class="o">.</span><span class="n">with_comments</span><span class="p">(</span><span class="k">if</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:show_comments</span><span class="o">]</span><span class="p">)</span>
</span><span class='line'>        <span class="o">.</span><span class="n">with_tags</span><span class="p">(</span><span class="k">if</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:tags</span><span class="o">]</span><span class="p">)</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Through just a few simple scoping mechanisms, we can trim down our controllers while still getting a very useful search from vanilla Rails.</p>
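
<p>As for the earlier note on keyword arguments: Ruby will parse a keyword parameter named <code>if</code>, but the method body cannot read it as a plain variable because <code>if</code> is a reserved word. Here is a small sketch of both options, using hypothetical method names. The second relies on <code>Binding#local_variable_get</code>, which assumes Ruby 2.1 or later:</p>

```ruby
# Options-hash version, in the same shape as the Post class methods above.
def with_hash(args = {})
  args[:if] ? 'filtered' : 'everything'
end

# Keyword version: `if` cannot be read as a bare name inside the method,
# so it has to be fetched from the binding instead.
def with_keyword(if: nil)
  condition = binding.local_variable_get(:if)
  condition ? 'filtered' : 'everything'
end

with_hash(if: 42)    # => "filtered"
with_hash            # => "everything"
with_keyword(if: 42) # => "filtered"
```

<p>The options hash keeps the call site identical while avoiding the binding lookup, which is likely why it reads as the simpler choice here.</p>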
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Impersonal Interview]]></title>
    <link href="http://www.baweaver.com/blog/2015/03/22/the-impersonal-interview/"/>
    <updated>2015-03-22T20:21:38-07:00</updated>
    <id>http://www.baweaver.com/blog/2015/03/22/the-impersonal-interview</id>
    <content type="html"><![CDATA[<p>It&rsquo;s been said <a href="http://erniemiller.org/2013/09/19/interviews-are-broken/">time</a> <a href="https://www.linkedin.com/pulse/20140625202040-56760691-your-technical-interview-is-broken">and</a> <a href="https://medium.com/backchannel/the-way-we-hire-is-all-wrong-3e19e2051f3e">time</a> again: our technical interview process is broken. Lately, it&rsquo;s become fashionable to attack the interview process, but little seems to be done about it. This is my opinion on the matter.</p>

<!-- more -->


<h2>IQ and GPA are Irrelevant</h2>

<p>Google famously came out saying that GPA was <a href="http://dailycaller.com/2013/06/20/google-executive-gpa-test-scores-worthless-for-hiring/">worthless</a>. Trick problems and brain teasers were doing little to no good in revealing good engineers.</p>

<p>In an industry where it&rsquo;s borderline impossible to establish reliable metrics for programmers&rsquo; skills, is it any wonder that interviews backfire? Despite this, we&rsquo;re trying to use one metric in particular to measure our coders, and it&rsquo;s doing a great deal of harm to the industry: memorization.</p>

<h2>Standardized Testing in Schools</h2>

<p>When confronted with the idea of standardized testing, teachers I&rsquo;ve spoken to have been quick to say that it is a poor measure of students. Some kids just don&rsquo;t learn by memorization or test well that way, and potentially brilliant young people are held back because they can&rsquo;t memorize a fact sheet. Why should they?</p>

<p>What bothered me in school was that I was expected to memorize a bunch of information that was literally inches from me, either via the internet or the textbook. Sure, I could memorize the quadratic formula, but what would be the point? Memory is faulty; being able to automate something or look it up is where the true value lies.</p>

<h2>We&rsquo;re testing rote memorization</h2>

<p>If this is failing so badly in our schools, why are we applying the same principles to coding interviews?</p>

<p>It&rsquo;s not uncommon to have pre-interview filters ask manual page questions, system internals, or things in general that would take no more than a second for someone to find in a reference. Instead, we expect them to have an instant answer to these questions when rarely do coders actually have such information memorized.</p>

<p>The number of false negatives here is staggering, especially for cross-job interviews such as a developer seeking devops jobs or vice versa. Of course a developer won&rsquo;t have systems knowledge memorized, and likewise an administrator probably won&rsquo;t have all of Skiena&rsquo;s book of algorithms memorized. Does this make them incapable of the job? Hardly.</p>

<p>We&rsquo;re not studying for a test. We&rsquo;re trying to show that we have what it takes to innovate, to build. Memory is an abhorrent measure of aptitude. If GPA is already pegged for this, we should be doing away with rote memorization in much the same manner.</p>

<h2>A Cache System</h2>

<p>An engineer&rsquo;s true power lies not in memorization. If anything, I would argue that it&rsquo;s a severe weakness to have an engineer who insists on only memorizing large amounts of information. It&rsquo;s inefficient. Much like a computer, only information that is immediately relevant should be cached. That&rsquo;s why we have references and man pages: we&rsquo;ve relegated the information to a metaphorical hard drive for later recovery when it&rsquo;s needed.</p>

<p>Here are a few examples:</p>

<ul>
<li>Instead of memorizing a system, document it for reference.</li>
<li>Instead of memorizing a deployment process, automate it.</li>
<li>Instead of memorizing esoteric language behaviors, write hooks in your editor and VCS to catch them.</li>
<li>Instead of memorizing how to manually get system uptime and kernel information, write a tool to fetch it and return relevant information.</li>
</ul>


<p>The list goes on.</p>
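
<p>As a rough sketch of the last item in that list, the following Ruby snippet gathers kernel information and uptime in one call. The method name <code>system_summary</code> is hypothetical. It assumes Ruby 2.2 or later for <code>Etc.uname</code>, and it only reports uptime where <code>/proc/uptime</code> exists, which in practice means Linux:</p>

```ruby
require 'etc'

# Collects kernel details and, where available, uptime in seconds.
def system_summary
  uname = Etc.uname
  summary = {
    kernel:  uname[:sysname],
    release: uname[:release],
    machine: uname[:machine]
  }

  # /proc/uptime is Linux-specific; elsewhere the value is left as nil.
  uptime_file = '/proc/uptime'
  summary[:uptime_seconds] =
    File.readable?(uptime_file) ? File.read(uptime_file).split.first.to_f : nil

  summary
end

system_summary.each { |key, value| puts "#{key}: #{value}" }
```

<p>Once something like this lives in a shared repository, nobody on the team has to remember which commands produce which numbers.</p>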

<p>An experienced engineer has a lot of knowledge to draw on from past jobs. Chances are they&rsquo;ve probably forgotten more than a more junior engineer has claimed to have memorized. Does this make them less valuable? Hardly. They recognize that it&rsquo;s sometimes necessary not to attempt to know everything up front.</p>

<p>Granted, an experienced engineer is going to be far more effective at finding the correct references and information. It makes a world of difference and can really speak to the skill level of a person:</p>

<blockquote class="twitter-tweet" lang="en"><p>the older I get, the more convinced I am that the key qualities of a senior engineer are research skills and leveraging past experience <a href="https://twitter.com/hashtag/fb?src=hash">#fb</a></p>&mdash; Scott Francis ن (@darkuncle) <a href="https://twitter.com/darkuncle/status/581509216893960192">March 27, 2015</a></blockquote>


<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>


<p>Who would you rather have working for you? Someone who memorizes your entire deployment process to a T, or someone who automates the entire thing so anyone can do it with a click? I would argue that the latter is exponentially more valuable to a team.</p>

<h2>Hit By a Bus Effect</h2>

<p>The weakness in memorization is that if only one person becomes a bastion of all knowledge, you&rsquo;re going to get in trouble quickly. If they take a vacation and the entire team falls apart, there&rsquo;s a problem. Information is meant to be shared and made easily accessible.</p>

<p>Memorization has the horrid side effect of blinding your team to bad documentation and process. By building tools, you won&rsquo;t have to explain to the poor new hire the hundreds of caveats of even starting to develop your application, or the nonsensical process that was allowed to grow over time.</p>

<h2>But Here We Are</h2>

<p>Yet given this, there&rsquo;s a perverse obsession with reciting algorithms from the book, quoting man pages, and all forms of memory-backed questions. It&rsquo;s a double standard. We interview on the metric of memory, yet any sane coder would panic if handed a completely memory-based coworker with no knack for automation and tooling.</p>

<h2>Then what&rsquo;s a better way?</h2>

<p>Do away with pre-interviews. All they do is filter out potentially great people over knowledge they may not have in immediate memory. Instead, ask them what they&rsquo;ve built, what makes them tick, what keeps them up plugging away. You&rsquo;ll learn far more from getting someone&rsquo;s story than from asking them five quick questions.</p>

<p>Avoid anything based in reciting man-pages and algorithm books. Instead, seek to either pair with the person or have them demonstrate on a small project. Dig through one of their already built projects, do something practical. Whatever tool they have available to them as a developer should be fair game. If you really want to be bold, have them bring their own laptops in to see how they work.</p>

<p>The point is to learn if this person can contribute to your team, not chant Dijkstra&rsquo;s Algorithm and write out a Quick Sort. If you&rsquo;re not going to be working on it daily, it does not belong in an interview.</p>

<h2>Parting Thoughts</h2>

<p>&ldquo;Everybody is a genius. But if you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid.&rdquo; - commonly attributed to Albert Einstein</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The World is a Functional Program]]></title>
    <link href="http://www.baweaver.com/blog/2015/02/20/the-world-is-a-functional-program/"/>
    <updated>2015-02-20T22:01:46-08:00</updated>
    <id>http://www.baweaver.com/blog/2015/02/20/the-world-is-a-functional-program</id>
    <content type="html"><![CDATA[<h1>Intro</h1>

<p>What if the world was, in its entirety, a functional program? Through the discovery of mathematics and pure functions, we can derive the process that led to our world.</p>

<p><img src="http://imgs.xkcd.com/comics/lisp.jpg" alt="We lost the documentation on quantum mechanics.  You'll have to decode the regexes yourself." /></p>

<!-- more -->


<p>This current variant is still in need of refinement, and comments are appreciated as I clean up around the edges. I&rsquo;m working on rewriting the current code segments and writing new ones in Lisp for added effect.</p>

<h1>Divine Recursion</h1>

<p>Take a simple recursive function, factorial:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">factorial</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
</span><span class='line'>  <span class="n">n</span> <span class="o">&lt;</span> <span class="mi">2</span> <span class="o">?</span> <span class="mi">1</span> <span class="p">:</span> <span class="n">n</span> <span class="o">*</span> <span class="n">factorial</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>We have established a base case in which a constant can be derived, numbers less than two will always return one. The flaw of current world views is that we assume creation to have occurred as a constant base case, and seek the answer there.</p>

<p>What really happened was something different entirely. While we quibble over how to find the base constants of our world, we miss a very crucial fact of inception: who called the function that started the divine recursion?</p>

<h1>Closures</h1>

<p>In programming we have the concept of a closure, where a function closes over a value:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">closure_fn</span> <span class="o">=</span> <span class="o">-&gt;</span> <span class="p">{</span>
</span><span class='line'>  <span class="n">my_data</span> <span class="o">=</span> <span class="s1">&#39;Foo&#39;</span>
</span><span class='line'>
</span><span class='line'>  <span class="o">-&gt;</span> <span class="nb">name</span> <span class="p">{</span>
</span><span class='line'>    <span class="s2">&quot;Hello, </span><span class="si">#{</span><span class="nb">name</span><span class="si">}</span><span class="s2">. </span><span class="si">#{</span><span class="n">my_data</span><span class="si">}</span><span class="s2">!&quot;</span>
</span><span class='line'>  <span class="p">}</span>
</span><span class='line'><span class="p">}</span><span class="o">.</span><span class="n">call</span>
</span><span class='line'>
</span><span class='line'><span class="c1"># This returns a function that we can now call:</span>
</span><span class='line'><span class="n">closure_fn</span><span class="o">.</span><span class="n">call</span><span class="p">(</span><span class="s1">&#39;Brandon&#39;</span><span class="p">)</span> <span class="c1"># =&gt; &quot;Hello, Brandon. Foo!&quot;</span>
</span></code></pre></td></tr></table></div></figure>


<p>The function inside captures the state around itself, enclosing it, or rather creating a closure over it.</p>
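
<p>To see that the inner function really carries its surroundings with it, here is a small variation. The name <code>make_greeter</code> is made up for illustration; each lambda it returns keeps the <code>greeting</code> it was created with:</p>

```ruby
# Builds a lambda that closes over the `greeting` it was created with.
make_greeter = -> greeting {
  -> name { "#{greeting}, #{name}!" }
}

hello = make_greeter.call('Hello')
howdy = make_greeter.call('Howdy')

hello.call('Brandon') # => "Hello, Brandon!"
howdy.call('Brandon') # => "Howdy, Brandon!"
```

<p>Neither lambda can see the other&rsquo;s captured value, even though both came from the same outer function.</p>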

<p>Our world is the result of a closure in which something defined our function with a set of constant values outside of our function, but inside of our scope of knowledge. This is how we derive logic, time, and the rules of the world in general. This raises the question, though: how did it get called? By something transcendent, much like a programmer calling a function.</p>

<h1>Free Will and Branch Theory</h1>

<p>When a recursive problem approaches a function that can go down many paths, its result is not necessarily known by the one who invoked it. Given certain conditionals, a branch may be permanently trimmed off the world tree as it approaches its absolute single-branch return value.</p>

<p>While the invoker may not know the result of the branch, they may be able to make certain guesses about the nature of its execution. With enough insight and knowledge of a program, you can predict the results of a tree. That makes it sound as if there&rsquo;s no such thing as free will and there&rsquo;s an inevitable predestination, but here&rsquo;s the brilliant part: it&rsquo;s not.</p>

<h1>Callbacks</h1>

<p>Inside each execution loop of the world tree, a callback is invoked in which an external function can be reached. This can be considered much the equivalent of prayer, sacrifice, meditation, and other spiritual activities. Given that these callbacks do not prevent the execution of malicious code, bad things are able to happen as a result of misusing them (Ouija boards, summonings, the occult, falling from grace).</p>

<p>Through this process of callbacks in the tree, the execution order is now unknown to even the invoker. Even at that, the knowledge of the inner workings of the program will still lend considerably more insight into the path of execution than will be known by the data (or person.)</p>

<p>In a way, it&rsquo;s a solved game. Much like a supercomputer playing chess, the end result was decided before the game even began. Free will is the result of the game itself being played in the interim around the fixed endpoints.</p>

<h1>Laziness, Currying, and Partial Application</h1>

<p>Given the process of callbacks, the entire world tree is already effectively defined but the functions have not been called as all the data is not present.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">plus_one</span> <span class="o">=</span> <span class="o">-&gt;</span> <span class="n">x</span> <span class="p">{</span>
</span><span class='line'>  <span class="o">-&gt;</span> <span class="n">y</span> <span class="p">{</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span> <span class="p">}</span>
</span><span class='line'><span class="p">}</span><span class="o">.</span><span class="n">call</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="n">plus_one</span> <span class="c1"># =&gt; anonymous function</span>
</span><span class='line'>
</span><span class='line'><span class="n">plus_one</span><span class="o">.</span><span class="n">call</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span> <span class="c1"># =&gt; 3</span>
</span></code></pre></td></tr></table></div></figure>


<p>Without all the data being present, a function will not be called. When provided with its last parameter, the value will be returned and a branch can be derived.</p>

<p>By currying our choices and states along the tree, we build up towards the execution of functions that will change the branch we&rsquo;re currently on. The world tree is lazy in nature, it will not execute branch changes until it has all the data necessary.</p>
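
<p>Ruby can do this bookkeeping for us through <code>Proc#curry</code>. A minimal sketch, with variable names chosen only for illustration, showing that nothing executes until the final argument arrives:</p>

```ruby
# A three-argument lambda; the body does not run until all three are supplied.
add_three = ->(x, y, z) { x + y + z }
curried   = add_three.curry

step_one = curried.call(1)  # still waiting on two arguments
step_two = step_one.call(2) # still waiting on one

step_two.is_a?(Proc) # => true, it is still a function
step_two.call(3)     # => 6, the last argument triggers the call
```

<p>Each intermediate step is itself a function holding the choices made so far, which is the lazy build-up described above.</p>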

<h1>Evil in the form of exceptions</h1>

<p>The catch in allowing callbacks and laziness in functions is that errors can and will be raised, including ones that are outside the influence of the invoker due to the nature of the function. Does this undermine the omnipotence of the invoker? No, as they had already provided rescue conditions throughout the application to save data from exceptions.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">saved</span><span class="p">(</span><span class="n">function</span><span class="p">,</span> <span class="n">data</span><span class="p">)</span>
</span><span class='line'>  <span class="n">function</span><span class="o">.</span><span class="n">call</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
</span><span class='line'><span class="k">rescue</span>
</span><span class='line'>  <span class="n">outer_context</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>
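
<p>A runnable variation of the same idea, assuming the function is handed over as a lambda. Here <code>outer_context</code> is a hypothetical fallback that is passed in explicitly rather than defined elsewhere, and <code>risky</code> and <code>fallback</code> are invented for the example:</p>

```ruby
# Calls `function` with `data`; if it raises, hands the data to the fallback instead.
def saved(function, data, outer_context)
  function.call(data)
rescue StandardError
  outer_context.call(data)
end

risky    = -> data { Integer(data) } # raises ArgumentError on non-numeric input
fallback = -> _data { 0 }            # hypothetical rescue value

saved(risky, '42', fallback)   # => 42
saved(risky, 'oops', fallback) # => 0
```

<p>The caller of <code>saved</code> never sees the exception; the data simply ends up in the outer context.</p>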


<h1>Monadic state</h1>

<p>So then how do we reconcile with a young earth versus an old? We don&rsquo;t. Both are plausible at the same time with the presence of monadic state, a seed of sorts. By invoking the world tree with a set of predefined knowledge, time can be simulated, elongated, or generally distorted beyond the current rules of our world tree.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">world_tree</span><span class="p">(</span><span class="n">state</span><span class="p">)</span>
</span><span class='line'>  <span class="n">some_execution_chain</span><span class="p">(</span><span class="n">state</span><span class="p">)</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="n">world_tree</span><span class="p">(</span><span class="ss">logic</span><span class="p">:</span> <span class="n">rules</span><span class="p">,</span> <span class="ss">entities</span><span class="p">:</span> <span class="n">creations</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Much like a dream to us, we view something as always present, predefined. Perhaps our concept of time is warped by the seed data in such a way that we observe something beyond our functional world tree. As it recurses, it carries with it the state that could have easily been arbitrarily defined along the way.</p>
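
<p>One way to picture that, as a hedged sketch: a recursive function that threads its state through each call, seeded with whatever starting values the invoker likes. The <code>grow_world_tree</code> name, the <code>:age</code> key, and the tick counts are all assumptions invented for this example:</p>

```ruby
# Hypothetical recursive world tree: each call advances the state by one tick
# and passes the new state down, never mutating the old one.
def grow_world_tree(state, ticks_left)
  return state if ticks_left.zero?

  next_state = state.merge(age: state[:age] + 1)
  grow_world_tree(next_state, ticks_left - 1)
end

# The same three ticks, started from two very different seeds.
young = grow_world_tree({ age: 0, entities: [:light] }, 3)
old   = grow_world_tree({ age: 4_000_000, entities: [:light] }, 3)

puts young[:age] # 3
puts old[:age]   # 4000003
```

<p>From inside the recursion, both runs look identical; only the seed decides how old the world appears to be.</p>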

<h1>Evolution and Functional Composition</h1>

<p>Evolution is also a result of seed data, but more thoroughly of functional composition:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">organism</span><span class="p">(</span><span class="n">cell</span><span class="p">(</span><span class="n">protein</span><span class="p">(</span><span class="n">x</span><span class="p">)))</span>
</span></code></pre></td></tr></table></div></figure>


<p>Functions are composed upon one another such that the pattern that composes a monkey may well be an earlier variant of a human that has not had all of its functional chain called through. This is what leads to similarities in DNA: a monkey would merely be a human without the remaining functions between them.</p>
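
<p>To make that concrete, here is a minimal sketch using <code>Proc#&gt;&gt;</code>, which assumes Ruby 2.6 or newer. The three lambdas are stand-ins that only wrap a label around their input:</p>

```ruby
# Stand-in functions; each one just records that it was applied.
PROTEIN  = ->(x) { "protein(#{x})" }
CELL     = ->(x) { "cell(#{x})" }
ORGANISM = ->(x) { "organism(#{x})" }

# Proc#>> composes left to right: PROTEIN runs first, ORGANISM last.
FULL_CHAIN    = PROTEIN >> CELL >> ORGANISM
# The same chain with the final function never called through.
PARTIAL_CHAIN = PROTEIN >> CELL

puts FULL_CHAIN.call(:x)    # organism(cell(protein(x)))
puts PARTIAL_CHAIN.call(:x) # cell(protein(x))
```

<p>The partial chain shares everything with the full one except the last step, which is the similarity the paragraph above is pointing at.</p>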

<p>We see evolution, but in reality it&rsquo;s the base functions that have been built from the ground up in order to create us and the creatures around us.</p>

<h1>The apocalypse and the return</h1>

<p>You remember the presence of constants in the system? They were never meant to be the beginning, but the end of the chain. If the end is already known, and the beginning was made from seed data, the process in the middle is left largely to the result of execution.</p>

<p>At the end of our world tree function, and when certain branches are returned, our state is transferred into the closure above us, more commonly known as our heavens and hells. These returns will only happen when a function has called through an entire branch at the end of the tree, known as the apocalypse.</p>

<p>In the interim, we&rsquo;re stored in a state that&rsquo;s carried throughout the remainder of the world tree, in what would be called Limbo.</p>

<p>Given that the function was invoked by an outside source, certain code may have been arbitrarily introduced in such a way to allow new state to manifest itself at certain points of the chain in ways that again defy our given rules. This can lead to such things as a virgin birth, resurrection, and even a return as the end condition itself.</p>

<p>The world is a functional program, and we are the data that flows through it.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Transcendent Turtle]]></title>
    <link href="http://www.baweaver.com/blog/2014/10/30/the-transcendent-turtle/"/>
    <updated>2014-10-30T20:16:31-07:00</updated>
    <id>http://www.baweaver.com/blog/2014/10/30/the-transcendent-turtle</id>
    <content type="html"><![CDATA[<p>To many, Minecraft was a gateway drug to the world of technology. Redstone was a novel idea that let us experiment with some circuitry, make traps, and in general create more dynamic things. It&rsquo;s great for the basics, and a lot of fun to work with, but the interesting thing about it is that you&rsquo;re already starting to program by using it. Why not take it a step further? Computercraft gives you the power to jump into a full programming environment inside Minecraft using Lua.</p>

<!-- more -->


<h2>But I&rsquo;m not a Programmer!</h2>

<p>Neither are the people who frequently play Minecraft. Really, you don&rsquo;t even have to know how to program to get the benefits of the mod. Programmers are a peculiar breed who love to share their creations publicly. That means you can get some amazing tools and scripts from brilliant people simply by looking through the <a href="http://www.computercraft.info/forums2/">Computercraft Forums</a>.</p>

<p>Say you can&rsquo;t find it but you want to make something. There are <a href="https://www.youtube.com/watch?v=DSsx4VSe-Uk">tons</a> <a href="https://www.youtube.com/watch?v=bnKuOJOaWIA">of</a> <a href="https://www.youtube.com/watch?v=H5a7S4eF7zw">tutorials</a> <a href="https://www.youtube.com/watch?v=3zUEprIoFwA">out</a> <a href="https://www.youtube.com/watch?v=1vK5rOkiW7g">there</a> for how to get started with Computercraft.</p>

<h2>It&rsquo;s going to be hard!</h2>

<p>If you already use Redstone, you&rsquo;re working with a lot harder material already. All those logic gates you use to get basic doors to work? What if you could just put a password on it? Simple:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class='lua'><span class='line'><span class="c1">-- Reference for more advanced: http://computercraft.info/wiki/Making_a_Password_Protected_Door</span>
</span><span class='line'>
</span><span class='line'><span class="k">while</span> <span class="kc">true</span> <span class="k">do</span>                         <span class="c1">-- We want to keep the program going</span>
</span><span class='line'>  <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;</span><span class="s">What is your quest?: &quot;</span><span class="p">)</span>       <span class="c1">-- Give them a prompt to let them know what you want</span>
</span><span class='line'>  <span class="n">input</span> <span class="o">=</span> <span class="n">read</span><span class="p">(</span><span class="s2">&quot;</span><span class="s">*&quot;</span><span class="p">)</span>                   <span class="c1">-- Read in their input</span>
</span><span class='line'>  <span class="k">if</span> <span class="n">input</span> <span class="o">==</span> <span class="s2">&quot;</span><span class="s">holygrail&quot;</span> <span class="k">then</span>        <span class="c1">-- Is the input the password we want?</span>
</span><span class='line'>    <span class="n">redstone</span><span class="p">.</span><span class="n">setOutput</span><span class="p">(</span><span class="s2">&quot;</span><span class="s">back&quot;</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>  <span class="c1">-- Send a redstone current behind the computer</span>
</span><span class='line'>    <span class="n">sleep</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span>                          <span class="c1">-- Wait a few seconds</span>
</span><span class='line'>    <span class="n">redstone</span><span class="p">.</span><span class="n">setOutput</span><span class="p">(</span><span class="s2">&quot;</span><span class="s">back&quot;</span><span class="p">,</span> <span class="kc">false</span><span class="p">)</span> <span class="c1">-- Lock it again</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>No having to reference that long image about logic gates, that&rsquo;s it. Welcome to the concept of programming abstractions. Redstone was a low level language, and Lua is a lot higher level.</p>

<p>Surely those mining robots are harder to make though. Not really. Want to make one dig a tunnel?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class='lua'><span class='line'><span class="c1">-- Reference for more advanced: http://pastebin.com/73gH7BUL</span>
</span><span class='line'>
</span><span class='line'><span class="nb">print</span><span class="p">(</span><span class="s2">&quot;</span><span class="s">How far we going boss?: &quot;</span><span class="p">)</span> <span class="c1">-- Ask them how far to go</span>
</span><span class='line'><span class="n">distance</span> <span class="o">=</span> <span class="nb">tonumber</span><span class="p">(</span><span class="n">read</span><span class="p">())</span>       <span class="c1">-- Get the distance as a number</span>
</span><span class='line'>
</span><span class='line'><span class="k">for</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">1</span><span class="p">,</span> <span class="n">distance</span> <span class="k">do</span> <span class="c1">-- For the numbers 1 up to the distance that was entered...</span>
</span><span class='line'>  <span class="n">turtle</span><span class="p">.</span><span class="n">dig</span><span class="p">()</span>     <span class="c1">-- Dig in front</span>
</span><span class='line'>  <span class="n">turtle</span><span class="p">.</span><span class="n">forward</span><span class="p">()</span> <span class="c1">-- Move forward</span>
</span><span class='line'>  <span class="n">turtle</span><span class="p">.</span><span class="n">digUp</span><span class="p">()</span>   <span class="c1">-- Dig above</span>
</span><span class='line'><span class="k">end</span>  <span class="c1">-- ...and repeat!</span>
</span></code></pre></td></tr></table></div></figure>


<h2>I don&rsquo;t even know what they can do</h2>

<p>That&rsquo;s what the <a href="http://computercraft.info/wiki/Main_Page">Wiki Pages</a> are for! Tons of information on how turtles work, what commands they can run, and various other handy bits.</p>

<h2>Typing on that terminal is annoying</h2>

<p>I agree, and I don&rsquo;t bother with it either. I type my code and post it on <a href="http://pastebin.com/">Pastebin</a>, and then just download it to the turtle like this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='lua'><span class='line'><span class="c1">-- From http://pastebin.com/73gH7BUL</span>
</span><span class='line'><span class="n">pastebin</span> <span class="n">get</span> <span class="mi">73</span><span class="n">gH7BUL</span> <span class="n">digger</span>
</span></code></pre></td></tr></table></div></figure>


<p>&hellip;where <a href="http://pastebin.com/73gH7BUL">73gH7BUL</a> is the url hash of the pastebin, and digger is the name we want to save the program as. All we need to do to use it now is to type in digger in the terminal and it&rsquo;s off on its merry way.</p>

<h2>It defeats the purpose of the game</h2>

<p>It really depends on who you ask. To me, the purpose is to build cool things, not spend forever gathering the resources to make it happen. Computercraft allows you to automate a lot of that work, and the nice thing is that most of the scripts for common things like digging tunnels and stairs are already out there for you to use.</p>

<p>If you feel content spending hours on mangling redstone to do what you can do in under 20 lines of Lua in a few minutes, more power to you. Best hope you didn&rsquo;t make a mistake, or you&rsquo;ll end up digging the entire thing up again. To me, it enhances the game by allowing you to get more done faster.</p>

<h2>Tunnels? Lame.</h2>

<p>How about a swarm of mining turtles controlled by a boss?: <a href="https://www.youtube.com/watch?v=g5153BiTNI8">https://www.youtube.com/watch?v=g5153BiTNI8</a></p>

<p>3D Printing from a turtle GUI Paint program?: <a href="https://www.youtube.com/watch?v=AuofE9dqiuU">https://www.youtube.com/watch?v=AuofE9dqiuU</a></p>

<p>Youtube videos in Minecraft?: <a href="https://www.youtube.com/watch?v=tpqOv7SxkHA">https://www.youtube.com/watch?v=tpqOv7SxkHA</a></p>

<p>Maybe a massive villager shopping mall: <a href="https://www.youtube.com/watch?v=Xasa_Jr-lcI">https://www.youtube.com/watch?v=Xasa_Jr-lcI</a></p>

<p>Though a Minecart Station may be your thing: <a href="https://www.youtube.com/watch?v=ws4iDwLc0zQ">https://www.youtube.com/watch?v=ws4iDwLc0zQ</a></p>

<p>The point is, if you can imagine it, someone has probably already built it. If not, you can make it. Of course the more advanced you get the harder it&rsquo;ll be, and programming can get hard past the trivial stuff. It takes time, but you can ask on the forums to get the help you need.</p>

<h2>It favors Programmers</h2>

<p>Well, yeah, it is programming in Minecraft. Experienced programmers will have an edge. The good thing is that most programmers love sharing their toys, and love it even more when people use them and thank them for it. The thing to remember is that all of the seriously advanced programs out there take days to weeks to complete, so they&rsquo;re not getting an easier time necessarily.</p>

<p>There are already tons of scripts online of all types to download that will have you running at about the same level as any mid-range developer, and they&rsquo;re even documented. Even if there is a veteran on the server, chances are they like to share as well. Just ask some time.</p>

<h2>I don&rsquo;t want to have to redownload things</h2>

<p>Yeah, me neither. If you&rsquo;re sufficiently advanced you&rsquo;re going to run into the issue of remaking your programs and having different versions out there. There are a few of us out there crazy enough to try and fix that issue with an entire deployment management system for turtles like <a href="https://github.com/baweaver/tortuga">Tortuga</a> (WIP) which will take care of a lot of that. Think Opscode Chef for turtles.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Burgeoning Blog]]></title>
    <link href="http://www.baweaver.com/blog/2014/10/20/a-burgeoning-blog/"/>
    <updated>2014-10-20T20:56:07-07:00</updated>
    <id>http://www.baweaver.com/blog/2014/10/20/a-burgeoning-blog</id>
    <content type="html"><![CDATA[<p>Why do people bother to blog? Very few must have anything truly breathtaking to say, at least not to the caliber of other writers already out there. I get it, it&rsquo;s intimidating to publish when there&rsquo;s already so much good content out there. There&rsquo;s always a fear of looking the fool, saying the wrong thing, or otherwise just doing a poor job of it. So why should you even bother?</p>

<!-- more -->


<h2>Relative to What?</h2>

<p>Open up Github, or whatever code store you may have, and take a look at the code you&rsquo;ve written even a few months ago. Chances are high you&rsquo;re cringing a bit at some of the things you&rsquo;ve written, patterns you&rsquo;ve tried, or even lack of testing. If you had the time, you&rsquo;d likely think of refactoring the entire thing, and doing it right this time.</p>

<p>That urge is one of the most compelling reasons you could ask for to start writing. That experience that transformed the way you think about code is a valuable thing, and worth sharing. It doesn&rsquo;t matter if the realization was that you shouldn&rsquo;t use <code>eval</code> in your code or that an abstraction could have saved several hours of time in the future, it&rsquo;s valuable.</p>

<h2>A Long Road Ahead</h2>

<p>Every programmer will find themselves at a different stage of experience, many looking for someone who went through the same trials they did. By writing, you&rsquo;ve given that person a resource on which they can build and grow as you did. You&rsquo;ve given them a map to guide them out of a potential pitfall that you once encountered, and by doing so you&rsquo;ve helped them move faster than they would have on their own.</p>

<p>Many a new programmer will find themselves terrified by the complexity that most of us take for granted. The hours of hacking away at a terminal just to get your first Rails or Node server running, the perils of deploying your first code, the nightmares of your first testing suite, these experiences are not to be undervalued. Writing a post explaining any of the things you had to fight through just to see that glorious <code>hello world</code> on the screen for the first time may be just the hope someone else needs to keep going.</p>

<h2>A Great Distance Traveled</h2>

<p>You&rsquo;ll find that as you blog, you can learn far more about yourself. You can chart where you&rsquo;ve been, what you&rsquo;ve learned over the years, and trace the path to what you&rsquo;ve become. It&rsquo;s a warm feeling to be able to point back at your earlier writings and say &ldquo;I was there too, once, and I made it.&rdquo;</p>

<p>Write, and share your struggles so that others may be lifted above them on the shoulders of giants.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Sanctimoniously Self-Made]]></title>
    <link href="http://www.baweaver.com/blog/2014/10/18/sanctimoniously-self-made/"/>
    <updated>2014-10-18T21:50:00-07:00</updated>
    <id>http://www.baweaver.com/blog/2014/10/18/sanctimoniously-self-made</id>
    <content type="html"><![CDATA[<p>In this industry, and especially in America in general, we place great value on being a self-made person. Beating the odds, overcoming adversity, and coming out on top. I fear that such an attitude is extremely toxic for one key reason: there&rsquo;s no such thing as being self-made.</p>

<!-- more -->


<p>This industry has a very grave problem in which we delude ourselves into thinking that our achievements and accolades are due solely to our own work. While it&rsquo;s critically important to work hard and learn, I feel that most miss the point. Behind the story of every towering success, every captain of industry, are people who helped get them there.</p>

<h2>Every Legend has a Story</h2>

<p>No one starts out a legend, that&rsquo;s for after the story has already been written. They become legend over time with the help of friends and colleagues. Steve Jobs had Wozniak, yet we rarely hear mention of him. Bill Gates had Allen, and again the crickets chirp. Why are we so wrapped up in heralding one person instead of the entire group?</p>

<p>What this has led to is a collection of people who believe they owe nothing to the worlds that raised them. They come to believe in the terrifying notion that people not in their position are not as hard-working or not as dedicated. That may be the case in some matters, but oftentimes it&rsquo;s far from the truth.</p>

<h2>Pay it Forward</h2>

<p>I strongly believe that we in the industry have an obligation to pay it forward. All the time people have spent investing in us should be given back to the community, whether that be mentoring, connecting, or even helping to pay someone&rsquo;s way. Remember it wasn&rsquo;t long ago that you may well have been in their same position, dazed and confused.</p>

<p>There have been several people in my life that have contributed to me getting to where I am today, and I thank them for investing so much time and effort. I wouldn&rsquo;t be here if not for them. From the patience of my High School tech teacher, to the hard-nosed Unix professor in College, and to the man who taught me everything I knew starting out when no one else in the area could understand what I was talking about.</p>

<p>If you know such people in your life, open a new tab and thank them. Remember what they&rsquo;ve done for you, and realize that there are yet more people coming up that could use you in much the same way.</p>

<h2>Seeking Seniors</h2>

<p>No one starts out a grizzled veteran or proficient programmer, and it&rsquo;s time we realize this.</p>

<p>The current trend is not sustainable. We look for Seniority when we fail to invest in bringing people to that level. Colleges pump out fresh new programmers to meet a need that we refuse to fill, instead defaulting to creating artificial scarcity. If you&rsquo;re in DevOps in San Francisco with a Senior level, take a look at your inbox if you don&rsquo;t believe it. It&rsquo;s not unusual for me to see 10+ messages a day at a Mid level.</p>

<h2>Juniors with 3+ years!?</h2>

<p>We set expectations for Junior positions to 3+ years of experience, and fail to mention anything of Entry Level. It&rsquo;s no wonder there can be such a panic on graduation. Meanwhile, there are some extremely clever people flying below the radar because your HR department is hard-nosed on time-based experience. By passing them over, you&rsquo;re missing out on an extremely passionate demographic of people.</p>

<p>That means being willing to hire a few Juniors instead of insisting on Senior levels. That means being willing to take on College Students to show them the industry, and what to expect.</p>

<h2>Stacking the Odds</h2>

<p>So what if graduates don&rsquo;t have your entire stack mastered? Can they learn? Are they willing? Honestly, you should also be asking yourself if you had even half the skills at that stage of life. The answer is most likely no, so why expect it from someone just entering?</p>

<p>By listing so much on a requirement, you may well be scaring off some truly brilliant people with a greater than average amount of modesty, a trait this industry sorely needs more of.</p>

<h2>A Final Thought</h2>

<p>To put this article as succinctly as possible: Invest in the future, or by the time you get there it won&rsquo;t be worth anything.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Reveling in REST]]></title>
    <link href="http://www.baweaver.com/blog/2014/10/04/revelling-in-rest/"/>
    <updated>2014-10-04T20:03:43-07:00</updated>
    <id>http://www.baweaver.com/blog/2014/10/04/revelling-in-rest</id>
    <content type="html"><![CDATA[<p>Services are always the last thing anyone mentions in conjunction with Angular, despite being the single most important part. It rather kills the point of having a frontend framework if you can&rsquo;t even get data properly from your respective backend. Great! Now that we have that out of the way, let&rsquo;s dive right into how to implement a RESTful client in Angular to get you up on your data in high fashion!</p>

<p>&hellip;except if you&rsquo;ve already tried, you noticed quite the disconcerting truth. Angular is Javascript, and like its kin it has as many implementations of REST as there are people who are capable of making one. A few might chuckle to themselves on <a href="http://www.winestockwebdesign.com/Essays/Lisp_Curse.html">the fulfillment of the LISP curse</a>, but fret not! There is hope yet, or at the very least someone with enough patience to lay out a few options worth looking into so you don&rsquo;t have to.</p>

<!-- more -->


<p>We&rsquo;re going to cover some of the more popular options out there, some of their strengths, and where they&rsquo;re going to quickly become a thorn in your side. For those unaware of the LISP curse, it&rsquo;s quite simply that the language is so powerful that everything becomes a social issue. Javascript is very close to LISP in terms of expressiveness, and as a result suffers from a lot of the same effects. Hundreds of half-baked implementations of what you want, with very few ever offering a full package beyond the all too common &ldquo;It solved my problem fine&rdquo; hack library.</p>

<h2><a href="https://docs.angularjs.org/api/ng/service/$http">$http</a></h2>

<p>The built-in http methods, also known as rolling your own service.</p>

<h3>The Good</h3>

<p>This is the low level of making a request. As long as it fits in the scheme of HTTP you can define it here. This gives you a lot of power to take care of those fine little details.</p>

<h3>The Bad</h3>

<p>The problem with having that type of power is that very very rarely can anything be described as special enough in a RESTful framework to necessitate fine-grained control. If it does, you&rsquo;re likely doing something very wrong and need to look at your implementation a bit more carefully.</p>

<h3>The Ugly</h3>

<p>When I say low level, I mean it. You have to handle setting up every method for every type of request. The only way to really overcome this is to set up a base service and define common methods, but by the time you do that, you&rsquo;re already a great deal of the way to items further down the list.</p>

<p>If you see yourself there, it&rsquo;s time to take a sober look in the mirror and ask yourself if you really want to invent the next RESTful service handler in Angular. Nothing against that if you have something clever, but chances are you just want to get work done, or at least you&rsquo;re supposed to be getting it done.</p>

<p>Yes, it&rsquo;s easy. Yes, you could probably make something pretty spiffy. Yes, it would fit your needs like a glove. No, you probably don&rsquo;t have the time to maintain every little detail of it if you manage to make a mistake on it. Use something already out there unless you really need that level of granularity. The LISP curse needs no more help propagating itself into the Javascript world.</p>

<h2><a href="https://docs.angularjs.org/api/ngResource/service/$resource">$resource</a></h2>

<p>An abstraction beyond <a href="https://docs.angularjs.org/api/ng/service/$http">$http</a> allowing you to define a lot of the methods at once.</p>

<h3>The Good</h3>

<p>Unlike <a href="https://docs.angularjs.org/api/ng/service/$http">$http</a> this allows you to define all of the resources in one swoop.</p>

<h3>The Bad</h3>

<p>The documentation is nigh unreadable and you&rsquo;ll spend plenty of time fumbling through blog posts and whatever books you can find to get a solid implementation of them.</p>

<h3>The Ugly</h3>

<p>It took a while for them to get to promises.</p>

<h2><a href="https://github.com/mgonto/restangular">Restangular</a></h2>

<p>Touted as solving a lot of the annoyances with <a href="https://docs.angularjs.org/api/ngResource/service/$resource">$resource</a>.</p>

<h3>The Good</h3>

<p>You probably won&rsquo;t need to bother with making Services, you can just drop in Restangular and use it in your controllers. Everything from that point on is a Restangular object you can call through on, and they all return promises. Very handy to get a lot of code out of repetitive services.</p>

<p>Want to do something out of the usual? Restangular can do it. You get custom methods for sending new types of requests, and you can even define your own methods on it.</p>

<h3>The Bad</h3>

<p>Hopefully you like lodash (I do), because it&rsquo;s a required dependency. This one is debatable, as I&rsquo;m of the opinion that people should be using it more as is, but I digress.</p>

<p>What about relationships? You&rsquo;re going to end up with a lot more code there, especially on trying to get many to many relationships to behave in anything that resembles coherence.</p>

<p>The custom methods are nice, but you&rsquo;re going to very quickly see your controllers start looking like half-baked services. If you&rsquo;re finding yourself defining a ton of custom methods for one-off requests, you&rsquo;ll find yourself going back towards services very quickly. Granted, that likely means you need to redesign systems on the backend, and of course there&rsquo;s nothing against abstracting Restangular into base controllers either.</p>

<h3>The Ugly</h3>

<p>These objects can become heavyweight fast. All the Restangular methods are getting appended to the objects, meaning you&rsquo;re passing around a lot more data. If you send a <code>POST</code> or <code>PUT</code> request to create or update something, you had better hope you&rsquo;re filtering parameters.</p>

<p>You&rsquo;ll end up getting a mouthful of Restangular chaff on every object you&rsquo;re trying to send up, and the only way to get around this one is to run a cleaner on it. To me it seems like far too much work being done for far too little extra gain.</p>

<h2><a href="https://github.com/jmdobry/angular-data">Angular Data</a></h2>

<p>Eventually you get fed up with all of this and decide that there has to be a better way to manage all of this. If you&rsquo;ve noticed a trend so far, it&rsquo;s that each successive recommendation is an abstraction on the last. Angular Data is the culmination of getting far too pissed off at implementing base services and other nonsense trying to get Angular to behave coherently.</p>

<h3>The Good</h3>

<p>You can define relationships, have resources defined much in the same way as a Rails-like framework, and even bind data to the scope with very little extra code.</p>

<p>That means you get fun like <code>hasMany</code>, <code>hasOne</code>, and <code>belongsTo</code>. No more needing to create extra methods to get at nested data, and no need to repetitively define relationships.</p>

<p>The author is extremely responsive to issues and is known to have feedback within the day. Many of the frustrations above were things that he&rsquo;d cited as reasons for creating this framework.</p>

<h3>The Bad</h3>

<p>I&rsquo;ve yet to find anything compelling against it to this point, except that you need to be very specific in telling it how your server responds to queries.</p>

<h1>So what wins?</h1>

<p>Really it depends on how much horsepower you need to get tasks done, but as of now I would still put Restangular as the go-to for most occasions with angular-data being a very interesting up and coming framework.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Frivolous Frontend Framework]]></title>
    <link href="http://www.baweaver.com/blog/2014/10/02/the-frivolous-frontend-framework/"/>
    <updated>2014-10-02T19:46:01-07:00</updated>
    <id>http://www.baweaver.com/blog/2014/10/02/the-frivolous-frontend-framework</id>
    <content type="html"><![CDATA[<p>Many a hardened Rails programmer will swear by their ERB or HAML, shaking a fist at the sky decrying the acolytes from the land of Javascript for their frivolous frontend frameworks.</p>

<p>&ldquo;Who needs them!&rdquo; they say haughtily. &ldquo;jQuery has sustained us perfectly fine, and our applications are not nearly large enough to warrant the extra overhead! Why would any of us use a frontend framework?&rdquo;</p>

<p>Yet there are those of us, standing upon the hill, pilgrims from the unholy land of Ajax Callbacks and Asynchronous Updates, looking upon them with something akin to pity.</p>

<!-- more -->


<h2>The Short Version</h2>

<p>If you&rsquo;re looking here for answers on whether you need a framework, chances are very high that you do. If you&rsquo;ve found your way fumbling about AJAX one too many dark nights whilst imbibing strong drink, it&rsquo;s time to bite the bullet and make a jump to a better place.</p>

<h2>The (Poetically) Long Version</h2>

<p>In the modern day web, dynamic never seems to be dynamic enough for some. Data needs to update live on the page, masses of components need to update and render as if in some practiced dance. You find yourself saying &ldquo;Just one more patch hack and it should work again.&rdquo; How many nights has it been now? Two? It seems you&rsquo;ve lost count. Anything akin to structure is a mad snarl of brambles waiting to take you should you misstep even one unit test.</p>

<p>You cry out in anguish as IE8 fails to render yet again. Surely there must be a better way, but the application is not yet large enough to warrant such an expenditure of effort! What you do are only a few AJAX calls to your APIs, the callbacks have only nested five levels by now. You&rsquo;ll switch when it gets worse, you think to yourself. Only, does anything ever happen in that most special circle of hell known as Technical Debt?</p>

<h3>What a piece of CRUD</h3>

<p>Duplication, everywhere you see. The same basic operations of Create, Read, Update, and Delete. All of them done to the tune of the team who happened to be working on them that particular week. None quite work the same, and all attempts at consolidation and style guides have long since been laughed off as meaningless. The code is littered with edge cases, special hacks, and one time things with promises of removal and cleaning.</p>

<p>It&rsquo;s not that any of the implementations were particularly bad (except for Bob&rsquo;s, that was a mess, how is he still working here again?). They all make sense in their own particular ways, and their creators could speak at great length on their strengths with such grandiose bravado. You nod vigorously, a great deal of sense is made here! &hellip;but venture you further into the tribe of Neckbeard to hear their prophet of the promise speak so eloquently of their path to righteousness. You knuckle your head as you bow out, some poor fool had brought up editors again.</p>

<p>It was quite a quandary, so many made sense but in such different ways. Which was right, who was to say, but then you saw it. The new hotness they had called it on HackerNews, singing its praises in bringing order to the chaos. Skeptically you listened in on the discussion, and rightly so. You seem to recall them saying something about Java and Perl dying again last week for the fifth time this year. Best to take them with a grain of salt. How many had claimed by now to have the one true way? AngularJS stood proud among the rest, EmberJS attracting its crowds as well, while still more frameworks begged attention.</p>

<p>Which one should I investigate? The answer is quite simply that they all have a point, and which one you choose is not nearly as important as deciding to migrate before Jira comes to swallow your hopes and dreams.</p>

<h3>AngularJS</h3>

<p>For the sake of this article, I&rsquo;ll cover things in the terms of Angular. I have a great deal of respect for Ember and the work they&rsquo;ve done, but I can&rsquo;t speak nearly as much to its strengths. The purpose of this is far more to show how a front end framework can liberate you from the shackles of the oppressive Raw AJAX.</p>

<p>Organically grown frameworks very seldom work, and more often than not end up becoming a cluster of micro-frameworks that are incomprehensible to all but the most trained in their ways. Best hope there are no rogue buses to rob you of their knowledge. While at first it seems like a good idea to allow free rein to interpret ideas and build more creatively, you&rsquo;ll quickly learn that anything that can be considered a social issue to programmers is grounds for a war.</p>

<p>This is why we have style guides and procedures in place, to bring order. No more having to listen to a lengthy discussion on the merit and readability of two spaces versus four, the style guide had set it in stone months ago.</p>

<p>Much the same can be said for a JavaScript frontend framework. While many would say they&rsquo;re frivolous things, they bring order to the wildness of JavaScript. If only for that reason I would take them into great consideration, but their effectiveness does not end there.</p>

<p>With Angular, the DOM feels almost a distant memory. Cleanly abstracted from you, table rows can be updated dynamically and the page manipulated with as little as a simple data binding. Search bars for data sets are within perhaps 50 characters at most. Things that would strike horror into a pure jQuery programmer&rsquo;s heart are now trivial, abstracted away to build upon to greater heights.</p>

<p>Now things are tied to directives and actions rather than nodes that may change by a simple accident. The page can be reasoned about as a whole rather than as segmented pieces cobbled together with selectors.</p>

<p>When someone asks me why a frontend framework, I would quite simply reply:</p>

<p>&ldquo;Because the sense of unity and structure they provide far outweighs the price of their implementation.&rdquo;</p>

<p>Well, that, and because Angular behaves properly in IE8 as of current versions. That alone is worth its weight in gold.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Many Woes with Many to Many Relations]]></title>
    <link href="http://www.baweaver.com/blog/2014/10/01/many-woes-with-many-to-many-relations/"/>
    <updated>2014-10-01T21:11:59-07:00</updated>
    <id>http://www.baweaver.com/blog/2014/10/01/many-woes-with-many-to-many-relations</id>
    <content type="html"><![CDATA[<p>Rails provides us with a lot of power in routing and associations, but if you&rsquo;ve ever tried to set up an API with any form of many-to-many relationship, you&rsquo;re in for a nightmare. Google won&rsquo;t save you, the Rails guides are sparse, and there&rsquo;s a grand total of <a href="http://ngauthier.com/2010/11/restful-many-to-many-relationships-in-rails.html">one good blog post</a> on the matter from a few years ago.</p>

<!-- more -->


<h2>Many to Many</h2>

<p>So how does a many to many relationship work? Via an association table containing IDs of both of the resources to be linked. Both then have access to the other collection. It&rsquo;s extremely handy for certain problems, and if you&rsquo;re just using Rails through the view you&rsquo;ll likely never have a problem with it.</p>
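<p>To make the association table concrete, here is a minimal plain-Ruby sketch with no Rails or database involved. The <code>Post</code>, <code>Category</code>, and <code>PostCategory</code> structs, the sample data, and the two lookup methods are all made up for illustration; in a real app ActiveRecord would do this work with SQL joins.</p>

```ruby
# Plain-Ruby sketch of a join table; names and data are hypothetical.
Post         = Struct.new(:id, :title)
Category     = Struct.new(:id, :name)
PostCategory = Struct.new(:post_id, :category_id) # one row per link

POSTS      = [Post.new(1, 'Hello'), Post.new(2, 'World')].freeze
CATEGORIES = [Category.new(1, 'ruby'), Category.new(2, 'rails')].freeze

# Post 1 is in both categories; post 2 is only in 'rails'.
POST_CATEGORIES = [
  PostCategory.new(1, 1),
  PostCategory.new(1, 2),
  PostCategory.new(2, 2)
].freeze

# Categories linked to a post, found by walking the join rows.
def categories_for(post_id)
  ids = POST_CATEGORIES.select { |row| row.post_id == post_id }.map(&:category_id)
  CATEGORIES.select { |category| ids.include?(category.id) }
end

# Posts linked to a category: the same walk from the other side.
def posts_for(category_id)
  ids = POST_CATEGORIES.select { |row| row.category_id == category_id }.map(&:post_id)
  POSTS.select { |post| ids.include?(post.id) }
end

puts categories_for(1).map(&:name).inspect # ["ruby", "rails"]
puts posts_for(2).map(&:title).inspect     # ["Hello", "World"]
```

<p>Neither side stores anything about the other; the link lives only in the join rows, which is why both collections can reach each other.</p>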

<h2>The Fun Starts</h2>

<p>But now you&rsquo;ve heard about this awesome thing called Angular / Ember / New Hot JS Framework that you just have to use. I don&rsquo;t blame you, a few weeks in Angular and I don&rsquo;t want to use Rails Views again. You decide to take the high road and segregate the apps, making Rails an API and using your framework (Angular assumed from here on out) to build out the frontend through calls.</p>

<p>It all works great; you even found <a href="https://github.com/mgonto/restangular">RestAngular</a> to help you out with some of the plumbing. Simple actions are now trivial. Want a list of a user&rsquo;s comments? Easy:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="c1">// Livescript</span>
</span><span class='line'><span class="nx">RestAngular</span><span class="p">.</span><span class="nx">one</span> <span class="err">\</span><span class="nx">users</span><span class="p">,</span> <span class="mi">1</span> <span class="p">.</span><span class="nx">getList</span> <span class="err">\</span><span class="nx">comments</span> <span class="p">.</span><span class="nx">then</span> <span class="p">(</span><span class="nx">data</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nx">$scope</span><span class="p">.</span><span class="nx">comments</span> <span class="o">=</span> <span class="nx">data</span>
</span></code></pre></td></tr></table></div></figure>


<h2>But then there are Categories</h2>

<p>RestAngular has us covered for every other case, and we&rsquo;re sailing along. Now we want to add categories to our posts, a many-to-many relationship. How would we script that one? Likely the first thing you try is this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="c1">// Livescript</span>
</span><span class='line'><span class="nx">RestAngular</span><span class="p">.</span><span class="nx">one</span> <span class="err">\</span><span class="nx">posts</span><span class="p">,</span> <span class="mi">1</span> <span class="p">.</span><span class="nx">getList</span> <span class="err">\</span><span class="nx">categories</span> <span class="p">.</span><span class="nx">post</span> <span class="nx">formData</span>
</span></code></pre></td></tr></table></div></figure>


<p>Checking the DB, you&rsquo;ll notice the new association isn&rsquo;t there. Odd. Maybe delete will work?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="c1">// Livescript</span>
</span><span class='line'><span class="nx">RestAngular</span><span class="p">.</span><span class="nx">one</span> <span class="err">\</span><span class="nx">posts</span><span class="p">,</span> <span class="mi">1</span> <span class="p">.</span><span class="nx">one</span> <span class="err">\</span><span class="nx">categories</span><span class="p">,</span> <span class="mi">1</span> <span class="p">.</span><span class="nx">remove</span><span class="o">!</span>
</span></code></pre></td></tr></table></div></figure>


<p>&hellip;except now, for some reason, category one is gone everywhere. Thinking through it, it becomes clear that what we&rsquo;ve done is simply requested a nested resource and sent it a delete request.</p>
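<p>The difference between the two deletes can be sketched in plain Ruby. This assumes a typical scaffold-style <code>destroy</code> action that looks the category up by its id and ignores the post id in the path; the in-memory tables and method names below are hypothetical stand-ins, not anything from Rails or RestAngular.</p>

```ruby
# In-memory stand-ins for the two tables (hypothetical data).
def fresh_tables
  {
    categories:      [{ id: 1, name: 'ruby' }],
    post_categories: [{ post_id: 1, category_id: 1 },
                      { post_id: 2, category_id: 1 }]
  }
end

# What the nested DELETE effectively asked for: destroy category 1
# itself. The post id is only part of the path and plays no role.
def delete_nested_category(tables, category_id)
  tables[:categories].reject! { |category| category[:id] == category_id }
  tables
end

# What we wanted: drop only the row linking this post to this category.
def delete_link(tables, post_id, category_id)
  tables[:post_categories].reject! do |row|
    row[:post_id] == post_id && row[:category_id] == category_id
  end
  tables
end

nested = delete_nested_category(fresh_tables, 1)
puts nested[:categories].length      # 0, gone for every post

linked = delete_link(fresh_tables, 1, 1)
puts linked[:categories].length      # 1, the category survives
puts linked[:post_categories].length # 1, post 2 keeps its link
```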

<h2>So what do you do?</h2>

<p>There&rsquo;s an association table with your name on it called something like PostCategory. Trying to route through either end of the relation is likely to give you nightmares.</p>

<p>First let&rsquo;s take a look at what your controller actions should look like to handle the queries:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">index</span>
</span><span class='line'>  <span class="no">PostCategory</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">params</span><span class="o">.</span><span class="n">slice</span><span class="p">(</span><span class="ss">:post_id</span><span class="p">,</span> <span class="ss">:category_id</span><span class="p">))</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">create</span>
</span><span class='line'>  <span class="vi">@post_category</span> <span class="o">=</span> <span class="no">PostCategory</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">post_category_params</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'>  <span class="k">if</span> <span class="vi">@post_category</span><span class="o">.</span><span class="n">save</span>
</span><span class='line'>    <span class="n">render</span> <span class="ss">json</span><span class="p">:</span> <span class="vi">@post_category</span><span class="p">,</span> <span class="ss">status</span><span class="p">:</span> <span class="ss">:created</span>
</span><span class='line'>  <span class="k">else</span>
</span><span class='line'>    <span class="n">render</span> <span class="ss">json</span><span class="p">:</span> <span class="vi">@post_category</span><span class="o">.</span><span class="n">errors</span><span class="p">,</span> <span class="ss">status</span><span class="p">:</span> <span class="ss">:unprocessable_entity</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">destroy</span>
</span><span class='line'>  <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:id</span><span class="o">]</span>
</span><span class='line'>    <span class="no">PostCategory</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="n">params</span><span class="o">[</span><span class="ss">:id</span><span class="o">]</span><span class="p">)</span><span class="o">.</span><span class="n">destroy</span>
</span><span class='line'>  <span class="k">else</span>
</span><span class='line'>    <span class="no">PostCategory</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">post_id</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:post_id</span><span class="o">]</span><span class="p">,</span> <span class="ss">category_id</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:category_id</span><span class="o">]</span><span class="p">)</span><span class="o">.</span><span class="n">first</span><span class="o">.</span><span class="n">destroy</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'>
</span><span class='line'>  <span class="n">head</span> <span class="ss">:no_content</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Note that the index action is a very succinct way of saying:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">index</span>
</span><span class='line'>  <span class="n">post_categories</span> <span class="o">=</span> <span class="no">PostCategory</span><span class="o">.</span><span class="n">all</span>
</span><span class='line'>  <span class="n">post_categories</span> <span class="o">=</span> <span class="n">post_categories</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">post_id</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:post_id</span><span class="o">]</span><span class="p">)</span> <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:post_id</span><span class="o">]</span>
</span><span class='line'>  <span class="n">post_categories</span> <span class="o">=</span> <span class="n">post_categories</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="ss">category_id</span><span class="p">:</span> <span class="n">params</span><span class="o">[</span><span class="ss">:category_id</span><span class="o">]</span><span class="p">)</span> <span class="k">if</span> <span class="n">params</span><span class="o">[</span><span class="ss">:category_id</span><span class="o">]</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Though the latter has been known to drive me to very lengthy discussions on mutability, morality, and ethics.</p>
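<p>The trick in the short version is that <code>slice</code> keeps only the keys that were actually sent, so a missing filter simply never reaches the query. Here is a rough plain-Ruby sketch of that behavior, assuming Ruby 2.5 or later for <code>Hash#slice</code>. <code>ROWS</code> and <code>filter_rows</code> are made up for the example and stand in for the table and for <code>where</code>.</p>

```ruby
# Hypothetical join rows standing in for the post_categories table.
ROWS = [
  { post_id: 1, category_id: 1 },
  { post_id: 1, category_id: 2 },
  { post_id: 2, category_id: 2 }
].freeze

# Keep only the filters the request actually sent, then require each
# row to match all of them. An empty filter hash matches every row.
def filter_rows(params)
  filters = params.slice(:post_id, :category_id)
  ROWS.select { |row| filters.all? { |key, value| row[key] == value } }
end

puts filter_rows(post_id: 1).length                 # 2
puts filter_rows(category_id: 2, page: 3).length    # 2, :page is ignored
puts filter_rows(post_id: 2, category_id: 2).length # 1
puts filter_rows({}).length                         # 3
```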

<p>This lets us search against either posts or categories depending on the params. Deleting by those same params, though, only works if we cheat a bit around the routes and define a <code>DELETE</code> action on the root resource:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">delete</span> <span class="s1">&#39;/post_categories&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;post_categories#destroy&#39;</span>
</span></code></pre></td></tr></table></div></figure>


<p>Not exactly the most straightforward method, but the alternatives are odd: adding controller actions like <code>post#add_category</code> to either end of the relation, plus extra routes every time you need one. I far prefer this idea. The only real difference is that you end up with a request like this instead:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="no">DELETE</span> <span class="n">mysite</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">post_categories?post_id</span><span class="o">=</span><span class="mi">1</span><span class="o">&amp;</span><span class="n">category_id</span><span class="o">=</span><span class="mi">1</span>
</span></code></pre></td></tr></table></div></figure>


<p>Now all we have to do is use the same basic actions as on any other service and we&rsquo;re golden:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="c1">// Livescript</span>
</span><span class='line'><span class="nx">RestAngular</span><span class="p">.</span><span class="nx">all</span> <span class="err">\</span><span class="nx">post_categories</span> <span class="p">.</span><span class="nx">post</span>
</span><span class='line'>  <span class="nx">post_category</span><span class="o">:</span>
</span><span class='line'>    <span class="nx">post_id</span><span class="o">:</span> <span class="nx">$scope</span><span class="p">.</span><span class="nx">new_category</span><span class="p">.</span><span class="nx">post_id</span>
</span><span class='line'>    <span class="nx">category_id</span><span class="o">:</span> <span class="nx">$scope</span><span class="p">.</span><span class="nx">new_category</span><span class="p">.</span><span class="nx">category_id</span>
</span><span class='line'>
</span><span class='line'><span class="nx">RestAngular</span><span class="p">.</span><span class="nx">all</span> <span class="err">\</span><span class="nx">post_categories</span> <span class="p">.</span><span class="nx">remove</span>
</span><span class='line'>  <span class="nx">post_id</span><span class="o">:</span> <span class="nx">$scope</span><span class="p">.</span><span class="nx">new_category</span><span class="p">.</span><span class="nx">post_id</span>
</span><span class='line'>  <span class="nx">category_id</span><span class="o">:</span> <span class="nx">$scope</span><span class="p">.</span><span class="nx">new_category</span><span class="p">.</span><span class="nx">category_id</span>
</span></code></pre></td></tr></table></div></figure>


<p>Wrap it in a service and you&rsquo;re set to go. Just remember that the association tables are there for a reason, so use them. Rely on too much Rails magic and you&rsquo;ll end up burned thinking something&rsquo;s going to work.</p>

<p>I welcome any thoughts in the comments on how better to address issues like this; I&rsquo;d love to hear your opinions!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Izzy Hackery]]></title>
    <link href="http://www.baweaver.com/blog/2014/09/30/streaming-hackery/"/>
    <updated>2014-09-30T22:43:48-07:00</updated>
    <id>http://www.baweaver.com/blog/2014/09/30/streaming-hackery</id>
    <content type="html"><![CDATA[<p>In which I explain the gem Izzy</p>

<!-- more -->


<p>In the time I&rsquo;ve been away from writing here, I&rsquo;ve had a bit of a stint of gem creation. We&rsquo;re going to cover a number of them in the coming week.</p>

<p>Some may say that monkeypatching is inherently evil, but I would tend to disagree. An RPG serves a very tactical purpose when used correctly, but oftentimes it can have rather unfortunate results in the hands of the untrained. Such is monkeypatching, something that should be viewed in a pragmatic sense rather than one of dogmatic vitriol. With that, let&rsquo;s take a look:</p>

<h2>Izzy</h2>

<p>Izzy got popular right after a <a href="http://rubyweekly.com/issues/180">Ruby Weekly post mentioned it</a> as a method of mitigating long conditionals. I made it for the express purpose of simplifying multiple conditionals on the same object into something more succinct.</p>

<p>Going off of what&rsquo;s in the README as far as order, we&rsquo;ll take a look into some of the inspiration and workings of each method.</p>

<h3>Matchers</h3>

<p>Matchers are methods that are checked against any of the attributes of an Object that includes Izzy. Let&rsquo;s say we have an instance of me made in a Person class implementing Izzy:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">brandon</span> <span class="o">=</span> <span class="no">Person</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="s1">&#39;brandon&#39;</span><span class="p">,</span> <span class="mi">24</span><span class="p">,</span> <span class="s1">&#39;m&#39;</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Now it gets really tiresome to do something like this while trying to validate against this object:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">brandon</span><span class="o">.</span><span class="n">age</span> <span class="o">&gt;</span> <span class="mi">18</span> <span class="o">&amp;&amp;</span> <span class="n">brandon</span><span class="o">.</span><span class="n">name</span> <span class="o">=~</span> <span class="sr">/^br/</span> <span class="o">&amp;&amp;</span> <span class="n">brandon</span><span class="o">.</span><span class="n">gender</span> <span class="o">==</span> <span class="s1">&#39;m&#39;</span>
</span></code></pre></td></tr></table></div></figure>


<p>It seems repetitive and downright unnecessary to specify the object multiple times. Rails has a tendency to use hashes to create, query, and update objects, so why not add some of that type of magic to validations?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">brandon</span><span class="o">.</span><span class="n">matches_all?</span> <span class="nb">name</span><span class="p">:</span> <span class="sr">/^br/</span><span class="p">,</span> <span class="ss">age</span><span class="p">:</span> <span class="o">-&gt;</span> <span class="n">a</span> <span class="p">{</span> <span class="n">a</span> <span class="o">&gt;</span> <span class="mi">18</span> <span class="p">},</span> <span class="ss">gender</span><span class="p">:</span> <span class="s1">&#39;m&#39;</span>
</span></code></pre></td></tr></table></div></figure>


<p>To me that&rsquo;s far more succinct. So how do we make something like this in Ruby? Let&rsquo;s take a look at the source:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">matches_all?</span><span class="p">(</span><span class="n">matchers</span> <span class="o">=</span> <span class="p">{})</span>
</span><span class='line'>  <span class="n">matchers</span><span class="o">.</span><span class="n">all?</span> <span class="o">&amp;</span><span class="n">matcher_check</span><span class="p">(</span><span class="ss">:all?</span><span class="p">)</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">matches_any?</span><span class="p">(</span><span class="n">matchers</span> <span class="o">=</span> <span class="p">{})</span>
</span><span class='line'>  <span class="n">matchers</span><span class="o">.</span><span class="n">any?</span> <span class="o">&amp;</span><span class="n">matcher_check</span><span class="p">(</span><span class="ss">:any?</span><span class="p">)</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">matches_none?</span><span class="p">(</span><span class="n">matchers</span> <span class="o">=</span> <span class="p">{})</span>
</span><span class='line'>  <span class="n">matchers</span><span class="o">.</span><span class="n">none?</span> <span class="o">&amp;</span><span class="n">matcher_check</span><span class="p">(</span><span class="ss">:any?</span><span class="p">)</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="kp">private</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">matcher_check</span><span class="p">(</span><span class="n">type</span> <span class="o">=</span> <span class="ss">:all?</span><span class="p">)</span>
</span><span class='line'>  <span class="o">-&gt;</span> <span class="n">matcher</span> <span class="p">{</span>
</span><span class='line'>    <span class="n">m</span><span class="p">,</span> <span class="n">val</span> <span class="o">=</span> <span class="o">*</span><span class="n">matcher</span>
</span><span class='line'>    <span class="n">values</span> <span class="o">=</span> <span class="n">val</span><span class="o">.</span><span class="n">is_a?</span><span class="p">(</span><span class="nb">Array</span><span class="p">)</span> <span class="p">?</span> <span class="n">val</span> <span class="p">:</span> <span class="nb">Array</span><span class="o">[</span><span class="n">val</span><span class="o">]</span>
</span><span class='line'>    <span class="n">values</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">type</span><span class="p">)</span> <span class="p">{</span> <span class="o">|</span><span class="n">v</span><span class="o">|</span> <span class="n">v</span> <span class="o">===</span> <span class="nb">self</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">m</span><span class="p">)</span> <span class="p">}</span>
</span><span class='line'>  <span class="p">}</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>The first thing you may notice is that the body of the block is abstracted into a private <code>matcher_check</code> method. This is so the logic can be reused by the two other matcher types.</p>

<p>The fun thing about this is that, because it&rsquo;s in a method, we can send an argument to it. The value is then pulled into the block, or closure if you prefer. In this case, we&rsquo;re dynamically choosing which check (<code>:all?</code> or <code>:any?</code>) to run the values against. Notice that we only use <code>:all?</code> in the <code>matches_all?</code> method.</p>
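<p>If the closure part feels abstract, here is a stripped-down sketch of the same idea outside of Izzy. <code>min_length</code> and <code>WORDS</code> are hypothetical names invented for this example; the only thing borrowed from the source above is the pattern of a method that returns a lambda and hands it over with <code>&amp;</code>.</p>

```ruby
# A method that builds a lambda. The lambda closes over `limit`,
# so each call to min_length produces a differently configured check.
def min_length(limit)
  ->(word) { word.length >= limit }
end

WORDS = %w[izzy ruby monkeypatch].freeze

# `&` turns the returned lambda into the block that all? / any? expect.
puts WORDS.all?(&min_length(4)) # true
puts WORDS.all?(&min_length(5)) # false
puts WORDS.any?(&min_length(5)) # true
```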

<p>Let&rsquo;s step through this piece by piece using only the check <code>brandon.matches_all? name: /^br/</code>:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="c1"># Call matches_all? on brandon:</span>
</span><span class='line'><span class="n">matchers</span> <span class="o">=</span> <span class="p">{</span><span class="nb">name</span><span class="p">:</span> <span class="sr">/^br/</span><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="n">matchers</span><span class="o">.</span><span class="n">all?</span> <span class="o">&amp;</span><span class="n">matcher_check</span><span class="p">(</span><span class="ss">:all?</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="c1"># matcher_check</span>
</span><span class='line'><span class="n">type</span> <span class="o">=</span> <span class="ss">:all?</span>
</span><span class='line'>
</span><span class='line'><span class="c1"># Hash gets exploded into the method and the value to check against</span>
</span><span class='line'><span class="n">m</span><span class="p">,</span> <span class="n">val</span> <span class="o">=</span> <span class="o">[</span><span class="ss">:name</span><span class="p">,</span> <span class="sr">/^br/</span><span class="o">]</span>
</span><span class='line'>
</span><span class='line'><span class="c1"># Since we&#39;re able to check against multiple conditions using an array, we want to make sure we have one to work with:</span>
</span><span class='line'><span class="n">values</span> <span class="o">=</span> <span class="nb">Array</span><span class="o">[</span><span class="sr">/^br/</span><span class="o">]</span>
</span><span class='line'>
</span><span class='line'><span class="c1"># We then check the values array with :all?, or :any? in the case of any and none checks.</span>
</span><span class='line'><span class="n">values</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="ss">:all?</span><span class="p">)</span> <span class="o">.</span><span class="n">.</span><span class="o">.</span>
</span><span class='line'>
</span><span class='line'><span class="c1"># which will use === to check it against the actual value:</span>
</span><span class='line'><span class="sr">/^br/</span> <span class="o">===</span> <span class="nb">self</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="ss">:name</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>So why does <code>===</code> work there, you might wonder? It&rsquo;s overridden by many classes, notably Regexp (match), Range (inclusion), and Proc (call). Most of the time it&rsquo;s bad practice not to use the longhand versions, but in this case it affords us a great deal of flexibility: we don&rsquo;t have to worry about how a value is evaluated as long as it does a proper match.</p>

<p>This is actually one of the most powerful features of the case statement, which uses <code>===</code> for its <code>when</code> clauses. Notice that <code>Proc#===</code> does the same thing as <code>Proc#call</code>, meaning you can throw lambdas and friends into the mix for even more powerful checks.</p>
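<p>A short sketch of the three flavours of <code>===</code> mentioned above, followed by a case statement that mixes them. The <code>classify</code> method and its messages are made up for illustration.</p>

```ruby
# Each of these classes gives === its own meaning.
is_even = ->(n) { n.even? }

regexp_hit = /^br/ === 'brandon'   # Regexp#=== tests for a match
range_hit  = (18..30) === 23       # Range#=== tests for inclusion
proc_hit   = is_even === 4         # Proc#=== calls the proc

# case/when uses === on each clause, so all three can sit side by side.
def classify(value)
  case value
  when /^br/            then 'starts with br'
  when 18..30           then 'between 18 and 30'
  when ->(v) { v.nil? } then 'nothing at all'
  else 'no match'
  end
end
```

<p>Calling <code>classify('brandon')</code>, <code>classify(23)</code>, and <code>classify(nil)</code> each lands in a different <code>when</code> clause without a single explicit method call in the conditions.</p>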

<h3>Boolean Matchers</h3>

<p>Boolean matchers were the original method, again using the abstracted block:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">all_of?</span><span class="p">(</span><span class="o">*</span><span class="nb">methods</span><span class="p">)</span>
</span><span class='line'>  <span class="nb">methods</span><span class="o">.</span><span class="n">all?</span> <span class="o">&amp;</span><span class="n">method_check</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">any_of?</span><span class="p">(</span><span class="o">*</span><span class="nb">methods</span><span class="p">)</span>
</span><span class='line'>  <span class="nb">methods</span><span class="o">.</span><span class="n">any?</span> <span class="o">&amp;</span><span class="n">method_check</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">none_of?</span><span class="p">(</span><span class="o">*</span><span class="nb">methods</span><span class="p">)</span>
</span><span class='line'>  <span class="nb">methods</span><span class="o">.</span><span class="n">none?</span> <span class="o">&amp;</span><span class="n">method_check</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="kp">private</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">method_check</span>
</span><span class='line'>  <span class="o">-&gt;</span> <span class="n">m</span> <span class="p">{</span> <span class="nb">self</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">m</span><span class="p">)</span> <span class="p">}</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>This one is far simpler: all it does is call a list of methods on an object and check the truthiness of what they return. If we had some methods defined on a person to check legal status, or various other simple checks, this would come in handy:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">brandon</span><span class="o">.</span><span class="n">all_of?</span> <span class="ss">:legal?</span><span class="p">,</span> <span class="ss">:older_than_21?</span><span class="p">,</span> <span class="ss">:male?</span>
</span></code></pre></td></tr></table></div></figure>
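<p>To try this end to end, here is a self-contained sketch. The <code>BooleanMatchers</code> module name, the <code>Person</code> class, and its age thresholds are assumptions made for the example; only the matcher methods mirror the code above.</p>

```ruby
# Hypothetical module name wrapping the matcher methods shown above.
module BooleanMatchers
  def all_of?(*methods)
    methods.all?(&method_check)
  end

  def none_of?(*methods)
    methods.none?(&method_check)
  end

  private

  # Returns a lambda that calls the named method on this object.
  def method_check
    ->(m) { send(m) }
  end
end

# Hypothetical class with a couple of simple predicate methods.
class Person
  include BooleanMatchers

  def initialize(age)
    @age = age
  end

  def legal?
    @age >= 18
  end

  def older_than_21?
    @age > 21
  end
end

brandon = Person.new(23)
teen    = Person.new(19)

brandon_passes = brandon.all_of?(:legal?, :older_than_21?)
teen_passes    = teen.all_of?(:legal?, :older_than_21?)
teen_under_21  = teen.none_of?(:older_than_21?)
```

<p>With these ages, <code>brandon_passes</code> is true, <code>teen_passes</code> is false, and <code>teen_under_21</code> is true.</p>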


<h3>Enumerable Module</h3>

<p>Sometimes it&rsquo;s nice to have a bit of that Rails feel in regular Ruby. These methods use <code>matches_all?</code> in conjunction with <code>select</code>, <code>reject</code>, and <code>find</code> to provide some Rails-like shorthand:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">select_where</span><span class="p">(</span><span class="n">matchers</span> <span class="o">=</span> <span class="p">{})</span>
</span><span class='line'>  <span class="nb">self</span><span class="o">.</span><span class="n">select</span> <span class="p">{</span> <span class="o">|</span><span class="n">s</span><span class="o">|</span> <span class="n">s</span><span class="o">.</span><span class="n">matches_all?</span> <span class="n">matchers</span> <span class="p">}</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">reject_where</span><span class="p">(</span><span class="n">matchers</span> <span class="o">=</span> <span class="p">{})</span>
</span><span class='line'>  <span class="nb">self</span><span class="o">.</span><span class="n">reject</span> <span class="p">{</span> <span class="o">|</span><span class="n">s</span><span class="o">|</span> <span class="n">s</span><span class="o">.</span><span class="n">matches_all?</span> <span class="n">matchers</span> <span class="p">}</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">find_where</span><span class="p">(</span><span class="n">matchers</span> <span class="o">=</span> <span class="p">{})</span>
</span><span class='line'>  <span class="nb">self</span><span class="o">.</span><span class="n">find</span> <span class="p">{</span> <span class="o">|</span><span class="n">s</span><span class="o">|</span> <span class="n">s</span><span class="o">.</span><span class="n">matches_all?</span> <span class="n">matchers</span> <span class="p">}</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>We&rsquo;re not always in Rails, and two of my favorite features there are the ActiveRecord <code>where</code> and <code>find</code> methods. Composing <code>matches_all?</code> with <code>select</code>, <code>reject</code>, or <code>find</code> gets us something similar quite nicely.</p>
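<p>Here is a rough, self-contained sketch of how that reads in practice. To avoid patching <code>Enumerable</code> globally, the methods live in a hypothetical <code>WhereSketch</code> module that is mixed into a single array, and <code>Contact</code> carries a simplified <code>matches_all?</code> that only handles single values.</p>

```ruby
# Hypothetical record type with a pared-down matches_all?.
Contact = Struct.new(:name, :age) do
  def matches_all?(matchers = {})
    matchers.all? { |m, val| val === send(m) }
  end
end

# Hypothetical module standing in for the Enumerable extensions.
module WhereSketch
  def select_where(matchers = {})
    select { |s| s.matches_all?(matchers) }
  end

  def find_where(matchers = {})
    find { |s| s.matches_all?(matchers) }
  end
end

people = [
  Contact.new('brandon', 23),
  Contact.new('bob', 17),
  Contact.new('brenda', 40)
].extend(WhereSketch)

br_names    = people.select_where(name: /^br/).map(&:name)
first_adult = people.find_where(age: 18..30)
```

<p><code>br_names</code> ends up as <code>["brandon", "brenda"]</code> and <code>first_adult</code> is the brandon record, which is about as close to <code>where</code> as plain Ruby needs to get.</p>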

<h2>Final Notes</h2>

<p>Combining multiple small functions into something larger is known as composition, one of the cornerstones of functional programming, and something well worth looking into. Not every gem has to be a monolithic beast that can tame the world&rsquo;s problems. Sometimes you only need to do the simple things well and build up from there.</p>

<p>Next up we&rsquo;ll look into <a href="https://github.com/baweaver/streamable">Streamable</a>, <a href="https://github.com/baweaver/pipeable">Pipeable</a>, <a href="https://github.com/banister/funkify">@banister&rsquo;s Funkify Library</a>, and hacking Piping functionality onto Ruby.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Setting up Rails in Debian 7]]></title>
    <link href="http://www.baweaver.com/blog/2013/10/02/setting-up-rails-in-debian-7/"/>
    <updated>2013-10-02T22:15:00-07:00</updated>
    <id>http://www.baweaver.com/blog/2013/10/02/setting-up-rails-in-debian-7</id>
    <content type="html"><![CDATA[<p>In this tutorial we&rsquo;ll cover the entire process of setting up a basic
Rails environment on a clean install of Debian 7.1.</p>

<!-- more -->


<h2>What am I making?</h2>

<p>You will be making a Debian 7.1 Box with Ruby 2.0, Rails 4.0, and Git.
At the time of this writing, these are the most recent versions.</p>

<h2>Virtual Box</h2>

<p>The first thing we&rsquo;re going to need is Virtual Box. Feel free to set
this up as a standalone OS instead; the process will essentially be the same.</p>

<p>In my case, I just need:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'>sudo apt-get install virtualbox
</span></code></pre></td></tr></table></div></figure>


<h2>Getting Debian</h2>

<p><a href="http://www.debian.org/">http://www.debian.org/</a></p>

<h2>Installing Debian</h2>

<p>Unless otherwise noted, specify the default options on your install. I
will note the steps as I go along installing Debian 7.1 i386 on an
instance of Virtual Box with a Host OS of Linux Mint 14 x64. The steps
should not differ heavily with other Host OS platforms.</p>

<h3>Hostname and Domainname</h3>

<p>Your host and domain names are completely up to you, but if this is just
a test I would suggest leaving them as the defaults for the time being.
They can be changed later on.</p>

<h3>User Accounts</h3>

<p>The same will apply to the passwords and the other information
used for account setups. At this point on a test box I specify a trivial
password and other information, considering I&rsquo;m installing on a VM that
will not see the light of day. I don&rsquo;t advocate doing such things on a
live server, the Ops will hit you or do nasty things to your home
directory if you do.</p>

<h3>Partitioning</h3>

<p>Select Guided for the partitioning method, unless you&rsquo;re feeling brave
or know your way around Unix. This will be explained in detail in a
later tutorial, but for now it will be fine to accept the defaults.</p>

<p>As it will tell you, select all on same partition. There are quite a few
benefits to separate partitions, but if this is a test box or a VM
it will be irrelevant for now.</p>

<p>Finish the partitioning and write the changes onto the disk.</p>

<h3>Base System Install</h3>

<p>After this point, the base system will begin to install. Now would be an
ideal time for coffee or other niceties you may desire, as it will take
about 5-10 minutes to complete.</p>

<h3>Configuring Apt</h3>

<p>This is another instance of selecting defaults unless you have a
compelling reason not to. Chances are low that you will, and HTTP
proxies will be rare in most cases considering you&rsquo;d be routing through
your Host&rsquo;s NIC.</p>

<h3>Select and Install Software</h3>

<p>Now would be another great time to catch a break, as it&rsquo;s going to be
downloading a fair amount of packages from the package server. Make sure
to watch for the popularity contest prompt. Feel free to select as you
wish.</p>

<p>On the packages list, you can deselect using the space bar. ONLY select
SSH Server and Standard System Utilities. We want to keep this
lightweight for tests. In the case of a server, DO NOT select a Desktop
environment. Put simply, you&rsquo;re doing yourself a disservice as most
commercial servers will be running headless as is. Press Enter to
continue, and it will continue to retrieve the requested files.</p>

<p>Now that we&rsquo;re here, it&rsquo;s time to boot into our new system!</p>

<h2>Getting Rails Set Up</h2>

<p>Go ahead and log into our test account. Now the first thing we&rsquo;re going
to want to get a hold of is a few programs:</p>

<figure class='code'><figcaption><span>Programs</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'>sudo apt-get install git zsh vim
</span></code></pre></td></tr></table></div></figure>


<p>ZSH and Vim are a matter of preference, but they will save you some headaches later on
down the road. Git is all but mandatory for any form of Rails
development. Learn version control; it will save you countless hours
later on.</p>

<p>Next we&rsquo;re going to want to get a hold of RVM, Ruby Version Manager, to
handle various Ruby installations.</p>

<figure class='code'><figcaption><span>RVM</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'>curl -L https://get.rvm.io | bash
</span><span class='line'><span class="nb">source</span> /etc/profile.d/rvm.sh
</span><span class='line'>
</span><span class='line'>rvm install 2.0
</span></code></pre></td></tr></table></div></figure>


<p>Notice the source command, you won&rsquo;t be getting very far without it.
This will take some time as it&rsquo;s building Ruby from source. Now to get
Rails running for us.</p>

<figure class='code'><figcaption><span>Rails Install</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'>gem install rails -v 4.0 --no-rdoc --no-ri
</span></code></pre></td></tr></table></div></figure>


<p>We&rsquo;re explicitly leaving off the documentation, as it takes
substantially longer to compile. The thought behind this is that you
should have a hold of the great Obie Fernandez&rsquo;s <a href="https://leanpub.com/tr4w">The Rails 4 Way</a> sitting on your
desk. No? Purchase it. I&rsquo;ll wait, and you have plenty of time before
Rails installs as well.</p>

<h2>Testing it out</h2>

<p>Now we&rsquo;ll get a skeleton app up to demonstrate that we have everything
working. Make a directory for tests, and run</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'>rails new <span class="nb">test</span>-app
</span></code></pre></td></tr></table></div></figure>


<p>You should see a lot of code flash by, and a hang at bundle install.
This is retrieving all the extra libraries for Rails to get running.</p>

<p>I will warn you there&rsquo;s a potentially nasty bug lurking here: a
JavaScript environment will need to be installed. Run the following
commands:</p>

<figure class='code'><figcaption><span>NodeJS Install</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'>sudo apt-get update
</span><span class='line'>sudo apt-get install python-software-properties python g++ make
</span><span class='line'>sudo add-apt-repository ppa:chris-lea/node.js
</span><span class='line'>sudo apt-get update
</span><span class='line'>
</span><span class='line'>sudo apt-get install nodejs
</span></code></pre></td></tr></table></div></figure>


<p>After this, go ahead and give it a shot and watch it come to life!</p>

<figure class='code'><figcaption><span>Rails Server</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'>rails s
</span></code></pre></td></tr></table></div></figure>


<p>Running into problems? Shoot me a tweet @keystonelemur and I&rsquo;ll add it to a footer
section of problems encountered and we&rsquo;ll get it all sorted out!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Getting Cozy with the Command Line]]></title>
    <link href="http://www.baweaver.com/blog/2013/09/29/getting-cozy-with-the-command-line/"/>
    <updated>2013-09-29T00:50:00-07:00</updated>
    <id>http://www.baweaver.com/blog/2013/09/29/getting-cozy-with-the-command-line</id>
    <content type="html"><![CDATA[<p>To some, the command line is a truly frightening beast. To be fair, when
I began, it really was. Who in the world would ever want to sit around
in a prompt when there are such beautiful visual editors out there? It
seems so counterintuitive that no one should ever want to go that way.</p>

<p>Yet here we are. The great bearded ones hammering away in their prompts,
invoking vim wizardry, emacs enigmas, and unix hackery. What makes them
so cozy?</p>

<!-- more -->


<h2>It Will Hurt</h2>

<p>When I was getting started in technology, a good friend and mentor of
mine recommended I install OpenBSD. I installed it, and the first words
out of my mouth were &lsquo;Where&rsquo;s the GUI!?&rsquo;</p>

<p>I&rsquo;d never been closer to throwing a computer out a window than trying to
figure that thing out. It was horrible, I was slow, and nothing made
sense. I kept projecting my expectations for an OS onto it, proclaiming
loudly how worthless it was and why it was so stupid.</p>

<p>After some coaxing, my friend told me to wait it out, read a few books
on the subject, and bear through it. If there was a single moment that
changed everything I&rsquo;d ever known on technology, this would have been
it.</p>

<p>The best advice I can give a newcomer to the great prompt is that man
pages are your friend, Google is an infinite purveyor of knowledge, and
Amazon holds within it great archives of literature waiting to be
discovered. Read, and find a basic Linux administration guide such as
<a href="http://www.amazon.com/Linux-Administration-Beginners-Guide-Soyinka/dp/0071767584/">Linux Administration - A Beginner&rsquo;s
Guide</a>.</p>

<p>The commands you are going to want to know inside and out are Awk, Sed,
Grep, and Find. They will make your experiences with log files and other
text files far more enjoyable.</p>

<h2>ZSH</h2>

<p>The single best thing you can do for your prompt is to install ZSH, and
shortly thereafter Oh My ZSH. Any knowledge you have of BASH will be
quickly transferable, as nearly all of it will already work in ZSH.</p>

<p>ZSH offers quite a few little niceties that will speed up your work flow
immensely.</p>

<ul>
<li>Command Correction - Type in the wrong command? It&rsquo;ll ask you what you
meant.</li>
<li>Tab Completion - Press Tab in an empty directory and you get a list of
all files to cycle through, and as you type it will start a fuzzy search.</li>
<li>Git Integration - cd into a Git Repo, and it will tell you your
branch, and give you a number of aliases to shorten git work.</li>
<li>Shared History - Command in the wrong shell? Not a problem with shared
history.</li>
</ul>


<p>There are so many more things that I could discuss on ZSH, but there are
<a href="http://mikegrouchy.com/blog/2012/01/zsh-is-your-friend.html">plenty</a>
<a href="http://www.slideshare.net/jaguardesignstudio/why-zsh-is-cooler-than-your-shell-16194692">of</a> <a href="http://fendrich.se/blog/2012/09/28/no/">reasons</a>.</p>

<h2>Aliases</h2>

<p>As I mentioned in an earlier post, aliases are your friend. Anything I
type more than once that&rsquo;s greater than five characters will get an
alias. Combine them with ZSH features such as global and suffix aliases and you can get some crazy commands
going fast.</p>

<h2>VIM</h2>

<p>Vim was probably the biggest learning curve I had when switching to a
command prompt based layout, and also by far the most rewarding when I
really got it. Heck, I&rsquo;m writing this post in Vim.</p>

<p>The biggest advantage to it is that your hands <strong>never</strong> have to leave
the keyboard. The mouse has become an enemy to productivity to me, and I
refuse to touch it when programming. Learning shortcuts and how to use
vim properly has sped up my programming substantially.</p>

<p>Combine it with any number of the <a href="https://github.com/tpope">Great Master Tim Pope&rsquo;s
Plugins</a>, and
Vim will be a match for just about any editor out there.</p>

<p>The real question to ask on matters of efficiency is this: When was the
last time you watched someone in Sublime or Textmate programming and
thought &lsquo;Wow!&rsquo; ? Go watch a Vim guru fly, and you&rsquo;ll swear you just
witnessed black magic.</p>

<h2>EMACS</h2>

<p>I can&rsquo;t mention Vim without mentioning Emacs, lest I invoke a Holy War.
Emacs is a beast all its own, and the only apt description of it would
be an Operating System pretending to be a Text Editor. Seriously, IRC
and a Music player? It undoubtedly has some substantial power, but
ultimately it clashed with my desire for a streamlined workflow.</p>

<p>Perhaps I&rsquo;ll come back to this after I start back into LISP and go into
Emacs, but for now it&rsquo;s not my thing.</p>

<h2>TMUX</h2>

<p>TMUX is a Terminal Multiplexer. Put simply, it allows you to have
multiple panes open in a single terminal window. Combine that with the
ability to save your sessions and make templates for new sections, and
it will quickly become valuable.</p>

<p>Admittedly I have not had as much of a chance as I would have liked
to experiment with it, but it is definitely worth a look.</p>

<h2>But Why?</h2>

<p>I switched to an almost completely terminal based workflow for one
reason in the end: efficiency. I&rsquo;m notoriously irritable with repetition
of anything, and anything that allows me to remove repetition is worth
the effort.</p>

<p>That, and it is always nice to have a new programmer accuse you of black magic
hackery after seeing you do anything.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Functional side of Ruby]]></title>
    <link href="http://www.baweaver.com/blog/2013/09/28/the-functional-side-of-ruby/"/>
    <updated>2013-09-28T11:08:00-07:00</updated>
    <id>http://www.baweaver.com/blog/2013/09/28/the-functional-side-of-ruby</id>
    <content type="html"><![CDATA[<p>Many people come into Ruby from a C-based language background, and are
quick to use only what they really feel comfortable with that has a
direct parallel in their language of choice. Doing so, you miss out on
all types of wonderful features of Ruby, and in this post we&rsquo;ll cover a
few of them.</p>

<!-- more -->


<h2>Functional?</h2>

<p><img src="http://imgs.xkcd.com/comics/functional.png" alt="XKCD 1270" /></p>

<p>If you&rsquo;re like me, you&rsquo;ve heard this word thrown around more than
anything, and never really defined. Everyone sings praises of this great
new renaissance of programming, but no one seems to know what it even
is.</p>

<p>In its simplest terms, functional programming is programming based on
functions. Functions return values, mutation is a naughty word, and the
law of the land is no side effects.</p>

<p>Well that sounds all well and good, but how exactly can you program if
all variables are in their final state? That seems rather
counterintuitive at best, and confoundedly stupid at worst. So why then?</p>

<h2>First Class Citizens</h2>

<p>In languages that support a functional programming style, all functions
are first class citizens. This means that they can themselves be passed
as arguments the same as any other value, because by their definition
they return a value.</p>

<p>With Ruby, every function returns a value, whether implicitly or
explicitly. Let&rsquo;s see what we mean here:</p>

<figure class='code'><figcaption><span>First Class Functions</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">def</span> <span class="nf">bob</span>
</span><span class='line'>  <span class="s2">&quot;my name is Bob!&quot;</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">hello</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>
</span><span class='line'>  <span class="nb">puts</span> <span class="s2">&quot;Hello, </span><span class="si">#{</span><span class="n">message</span><span class="si">}</span><span class="s2">&quot;</span>
</span><span class='line'><span class="k">end</span>
</span><span class='line'>
</span><span class='line'><span class="n">hello</span> <span class="n">bob</span>
</span><span class='line'>  <span class="c1"># =&gt; Hello, my name is Bob!</span>
</span></code></pre></td></tr></table></div></figure>


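<p>We just fed the result of one function straight into another. Strictly
speaking, Ruby called <code>bob</code> first and passed along its return value; to
hand over the function itself you would reach for <code>method(:bob)</code>, a proc,
or a lambda. Either way, this opens up a lot of interesting possibilities,
which brings us to our next point.</p>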
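<p>For the curious, here is a small sketch of handing over the function itself rather than its result. <code>greet</code> is a hypothetical variation on <code>hello</code> that expects something callable and decides for itself when to call it.</p>

```ruby
def bob
  'my name is Bob!'
end

# Hypothetical variant of hello: takes anything that responds to call.
def greet(message_source)
  "Hello, #{message_source.call}"
end

# method(:bob) wraps the method in an object without calling it.
from_method = greet(method(:bob))

# A lambda works just as well, since it also responds to call.
from_lambda = greet(-> { 'I have no name!' })
```

<p>In both cases nothing runs until <code>greet</code> invokes <code>call</code>, which is what makes the function a value in its own right.</p>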

<h2>Anonymous Functions</h2>

<p>Anonymous functions are functions without a name. This may sound
strangely foreign, but if you&rsquo;ve ever touched JavaScript you might
recognize this pattern:</p>

<figure class='code'><figcaption><span>Anonymous Functions in Javascript</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="kd">var</span> <span class="nx">square</span> <span class="o">=</span> <span class="kd">function</span> <span class="p">(</span><span class="nx">x</span><span class="p">){</span>
</span><span class='line'>  <span class="nx">alert</span><span class="p">(</span><span class="nx">x</span><span class="o">*</span><span class="nx">x</span><span class="p">);</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>You may notice that we just set a variable equal to a function, or more
correctly that we just named an anonymous function. So where did this
come from? Let&rsquo;s take a look at the same thing in Scheme:</p>

<figure class='code'><figcaption><span>Anonymous Functions in Scheme</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='scheme'><span class='line'><span class="p">(</span><span class="k">define </span><span class="nv">square</span> <span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">x</span><span class="p">)</span> <span class="p">(</span><span class="nb">* </span><span class="nv">x</span> <span class="nv">x</span><span class="p">)))</span>
</span></code></pre></td></tr></table></div></figure>


<p>This type of pattern is extremely common in Lisp-like languages, which
is why some readers are going to start noticing some striking
similarities to Ruby at this point. Let&rsquo;s give this one more try in
Ruby:</p>

<figure class='code'><figcaption><span>Anonymous Functions in Ruby</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">square</span> <span class="o">=</span> <span class="nb">lambda</span> <span class="p">{</span> <span class="o">|</span><span class="n">x</span><span class="o">|</span> <span class="n">x</span> <span class="o">*</span> <span class="n">x</span> <span class="p">}</span>
</span><span class='line'><span class="n">square</span> <span class="o">=</span> <span class="o">-&gt;</span><span class="p">(</span><span class="n">x</span><span class="p">){</span> <span class="n">x</span> <span class="o">*</span> <span class="n">x</span> <span class="p">}</span> <span class="c1"># Ruby 1.9+ Syntax</span>
</span></code></pre></td></tr></table></div></figure>


<p>Blocks are essentially anonymous functions that are called on the fly to
operate on enumerator values, and then discarded. Blocks can also be saved as
Proc objects if need be.</p>
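<p>As a quick sketch of saving a block for later, the hypothetical <code>keep_block</code> method below captures whatever block it is given as a Proc and simply returns it.</p>

```ruby
# Hypothetical helper: the ampersand parameter turns the block into a Proc.
def keep_block(&block)
  block
end

doubler = keep_block { |x| x * 2 }

# The saved block can be called directly...
doubled_once = doubler.call(21)

# ...or turned back into a block for another method.
doubled_list = [1, 2, 3].map(&doubler)
```

<p>Here <code>doubled_once</code> is 42 and <code>doubled_list</code> is <code>[2, 4, 6]</code>; the block outlived the call that created it.</p>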

<h2>Why Bother?</h2>

<p>What benefits does it really bring? Is it even worth it? In short, the
author&rsquo;s (probably biased) opinion is yes. The key reason for this is
idempotence.</p>

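<p>You see, idempotence is a complicated word, used here to mean that
no matter how many times you run a function, given the same input it
will <em>always</em> return the same output. (Strictly speaking that property
is called purity or referential transparency; idempotence proper means that
applying an operation repeatedly has the same effect as applying it once.)</p>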

<p>The benefit of this is that you don&rsquo;t have to worry about a mystical
black box and ordering scheme, as well as necessary blood sacrifices in
order to get a unit test to pass. You know for a fact that a function
will return the same every single time. That, my friends, will save you
a great many nightmares down the road.</p>
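<p>A tiny sketch of the difference, using two made-up lambdas: one leans on outside state, the other depends only on its input.</p>

```ruby
counter = 0

# Depends on and changes state outside itself: the answer drifts.
impure_next = -> { counter += 1 }

# Depends only on its argument: same input, same output, every time.
pure_next = ->(n) { n + 1 }

impure_results = [impure_next.call, impure_next.call, impure_next.call]
pure_results   = [pure_next.call(1), pure_next.call(1), pure_next.call(1)]
```

<p>The first list comes back as <code>[1, 2, 3]</code> while the second is <code>[2, 2, 2]</code>, which is exactly the predictability a unit test wants.</p>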

<p>The great thing about idempotence is it translates almost directly into
thread-safe methods that will not do unusual things to your values if
written correctly. Functional languages thrive in multi-threaded
and distributed environments; just look at Erlang.</p>

<p>Erlang was a language invented at Ericsson in order to manage their
massive telephone networks. They came across a hairy question: how do
we update our phone network <strong>and</strong> ensure no down time? Enter Erlang
with its hot-swappable modules that could be changed out in production.
Functional languages and techniques can give you that type of power.</p>

<h2>The Good and the Bad</h2>

<p>So what constitutes good practice and bad practice? Let&rsquo;s dive into a
few examples, shall we?</p>

<p>In string manipulation, modifying the original string will yield some
very nasty side effects very quickly.</p>

<figure class='code'><figcaption><span>String Manipulation</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">str</span> <span class="o">=</span> <span class="s1">&#39;Hello, &#39;</span> <span class="o">+</span> <span class="n">str</span> <span class="c1"># BAD, we just mutated the variable! Running</span>
</span><span class='line'><span class="c1"># multiple times would be BAD news.</span>
</span><span class='line'>
</span><span class='line'><span class="s2">&quot;Hello, </span><span class="si">#{</span><span class="n">str</span><span class="si">}</span><span class="s2">&quot;</span> <span class="c1"># GOOD, no mutation, just a return value</span>
</span></code></pre></td></tr></table></div></figure>
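<p>To see the difference in practice, here is a small sketch with two hypothetical helpers, <code>greet</code> and <code>greet!</code>. The first returns a new string; the second changes the string it was handed:</p>

```ruby
# Returns a new string and leaves the argument alone.
def greet(name)
  "Hello, #{name}"
end

# Changes the caller's string in place with String#prepend.
def greet!(name)
  name.prepend("Hello, ")
end

name = "World".dup       # dup gives us a string we are allowed to modify
once = greet(name)       # "Hello, World"
twice = greet(name)      # "Hello, World" again
untouched = name.dup     # still "World"

greet!(name)
greet!(name)
# name is now "Hello, Hello, World"
```

<p>Calling <code>greet</code> any number of times is harmless, while every call to <code>greet!</code> stacks another greeting onto the original.</p>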


<p>Bang (!) methods should be used extremely rarely, as they typically
modify the receiver in place. Instead, return the results to a new array.</p>

<figure class='code'><figcaption><span>Square an Array</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">ary</span> <span class="o">=</span> <span class="o">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="o">]</span>
</span><span class='line'>
</span><span class='line'><span class="n">ary</span><span class="o">.</span><span class="n">map!</span><span class="p">{</span> <span class="o">|</span><span class="n">i</span><span class="o">|</span> <span class="n">i</span> <span class="o">*</span> <span class="n">i</span> <span class="p">}</span> <span class="c1"># BAD, mutated array</span>
</span><span class='line'>
</span><span class='line'><span class="n">new_ary</span> <span class="o">=</span> <span class="n">ary</span><span class="o">.</span><span class="n">map</span><span class="p">{</span> <span class="o">|</span><span class="n">i</span><span class="o">|</span> <span class="n">i</span> <span class="o">*</span> <span class="n">i</span> <span class="p">}</span> <span class="c1"># GOOD</span>
</span></code></pre></td></tr></table></div></figure>
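<p>The danger is easier to see when a second variable points at the same array. In this sketch <code>report_input</code> is a made-up name standing in for any other part of your program that holds a reference to the array:</p>

```ruby
ary = [1, 2, 3, 4, 5]
report_input = ary                 # a second name for the same array object

squares = ary.map { |i| i * i }    # new array, original untouched
after_map = report_input.dup       # [1, 2, 3, 4, 5]

ary.map! { |i| i * i }             # rewrites the shared array in place
after_map_bang = report_input.dup  # [1, 4, 9, 16, 25]
```

<p>Nothing ever assigned to <code>report_input</code> after the first line, yet its contents changed underneath it.</p>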


<p>This would be more amusing if I hadn&rsquo;t done it before when I started.
Read up on the Enumerable module, as it will save you immeasurable
amounts of time in the long run.</p>

<figure class='code'><figcaption><span>Select from an Array</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">ary</span> <span class="o">=</span> <span class="sx">%w(hartnell troughton pertwee baker davison baker mccoy mcgann</span>
</span><span class='line'><span class="sx">eccleston tennant smith capaldi)</span>
</span><span class='line'>
</span><span class='line'><span class="n">new_ary</span> <span class="o">=</span> <span class="o">[]</span>
</span><span class='line'><span class="n">ary</span><span class="o">.</span><span class="n">each</span><span class="p">{</span> <span class="o">|</span><span class="nb">name</span><span class="o">|</span> <span class="n">new_ary</span> <span class="o">&lt;&lt;</span> <span class="nb">name</span> <span class="k">if</span> <span class="nb">name</span><span class="o">.</span><span class="n">length</span> <span class="o">&gt;</span> <span class="mi">5</span> <span class="p">}</span> <span class="c1"># BAD</span>
</span><span class='line'>
</span><span class='line'><span class="n">new_ary</span> <span class="o">=</span> <span class="n">ary</span><span class="o">.</span><span class="n">select</span><span class="p">{</span> <span class="o">|</span><span class="nb">name</span><span class="o">|</span> <span class="nb">name</span><span class="o">.</span><span class="n">length</span> <span class="o">&gt;</span> <span class="mi">5</span> <span class="p">}</span> <span class="c1"># GOOD</span>
</span></code></pre></td></tr></table></div></figure>


<p>Again with the things I wish I had never done. Hand-rolled counters like
this are definitely not needed in a language like Ruby, where practically
everything is an object. Again, learning the built-in methods will save
you a lot of time.</p>

<figure class='code'><figcaption><span>Count Number of Records Processed</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">ary</span> <span class="o">=</span> <span class="sx">%w(hartnell troughton pertwee baker davison baker mccoy mcgann</span>
</span><span class='line'><span class="sx">eccleston tennant smith capaldi)</span>
</span><span class='line'>
</span><span class='line'><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span>
</span><span class='line'><span class="n">new_ary</span> <span class="o">=</span> <span class="n">ary</span><span class="o">.</span><span class="n">select</span><span class="p">{</span> <span class="o">|</span><span class="nb">name</span><span class="o">|</span> <span class="n">i</span> <span class="o">+=</span> <span class="mi">1</span> <span class="k">if</span> <span class="nb">name</span><span class="o">.</span><span class="n">length</span> <span class="o">&gt;</span> <span class="mi">5</span><span class="p">;</span> <span class="nb">name</span><span class="o">.</span><span class="n">length</span> <span class="o">&gt;</span> <span class="mi">5</span> <span class="p">}</span> <span class="c1"># BAD</span>
</span><span class='line'>
</span><span class='line'><span class="n">new_ary</span> <span class="o">=</span> <span class="n">ary</span><span class="o">.</span><span class="n">select</span><span class="p">{</span> <span class="o">|</span><span class="nb">name</span><span class="o">|</span> <span class="nb">name</span><span class="o">.</span><span class="n">length</span> <span class="o">&gt;</span> <span class="mi">5</span> <span class="p">}</span>
</span><span class='line'><span class="n">new_ary</span><span class="o">.</span><span class="n">count</span> <span class="c1"># GOOD</span>
</span></code></pre></td></tr></table></div></figure>
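<p>If all you need is the number and not the filtered array, <code>count</code> also accepts a block, so the two steps can be folded into one. A short sketch using the same list:</p>

```ruby
ary = %w(hartnell troughton pertwee baker davison baker mccoy mcgann
eccleston tennant smith capaldi)

# count with a block tallies the elements for which the block is truthy.
long_name_count = ary.count { |name| name.length > 5 }
# long_name_count is 8
```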


<p>The amount of time that you will save by simply reading over the
Enumerable module, and learning the methods map, reduce, and select
will be astounding. All of them trace back to Lisp-like languages.</p>
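<p>As a quick sketch of the three working together on the list from the earlier examples, with no counters or temporary arrays to manage by hand:</p>

```ruby
ary = %w(hartnell troughton pertwee baker davison baker mccoy mcgann
eccleston tennant smith capaldi)

# select keeps the elements the block approves of.
long_names = ary.select { |name| name.length > 5 }

# map turns each element into something else.
lengths = long_names.map { |name| name.length }

# reduce folds the whole list down to a single value.
total = lengths.reduce(0) { |sum, length| sum + length }
# total is 60
```

<p>Each step returns a new value and leaves <code>ary</code> exactly as it was.</p>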

<h2>In the Wild</h2>

<p>So, this is a fairly short writeup on the subject, and I will definitely
cover it in more detail later on, but you should have a decent idea of
what to look for.</p>

<p>The thing to take away from this is that, when functional techniques are
used properly, writing unit tests and making sure things behave as they
should become far easier. Some of the hardest tasks in programming merely
require a different perspective.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Automate it]]></title>
    <link href="http://www.baweaver.com/blog/2013/09/26/automate-it/"/>
    <updated>2013-09-26T21:51:00-07:00</updated>
    <id>http://www.baweaver.com/blog/2013/09/26/automate-it</id>
    <content type="html"><![CDATA[<p>The better programmer is not the one who flies across the keyboard,
generating hundreds of lines of code, but the one who has but a few
strokes that do the same work with half the effort.</p>

<!-- more -->


<h2>When to Automate</h2>

<ul>
<li> Did you use it more than twice in a day? Automate it.</li>
<li> Does it take more than five keystrokes? Alias it.</li>
<li> Are you repeating yourself? Automate it.</li>
<li> Did you just wonder if you should automate it? Do it.</li>
</ul>


<p>Automation seems to be a scary concept for some, a black magic that many
try to avoid because they already know all of their commands and
appreciate their vanilla editors.</p>

<h2>But it takes too much time!</h2>

<p><img src="http://imgs.xkcd.com/comics/the_general_problem.png" alt="XKCD 974" /></p>

<p>In some cases, yes, you are spending far more time automating something
than actually getting it done. Then again, really, are those cases common
enough that you can justify it all away with just that? XKCD,
as always, has our backs on timing it out:</p>

<p><img src="http://imgs.xkcd.com/comics/is_it_worth_the_time.png" alt="XKCD 1205" /></p>

<h2>Shells</h2>

<p>If you&rsquo;re in a Unix environment, your shell should be your best friend.
Know your way around a command prompt well enough and you&rsquo;ve already
made some serious headway in reducing the amount of time it takes to do
something!</p>

<p>I would seriously suggest taking a look at
<a href="http://mikegrouchy.com/blog/2012/01/zsh-is-your-friend.html">ZSH</a> and its extension
<a href="https://github.com/robbyrussell/oh-my-zsh">Oh My ZSH!</a>, as they alone will save you a lot of time at a shell
prompt.</p>

<h2>Aliases</h2>

<p>Now let&rsquo;s take a look at a few of the aliases I use most often:</p>

<figure class='code'><figcaption><span>Aliases</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='sh'><span class='line'><span class="nb">alias </span><span class="nv">v</span><span class="o">=</span><span class="s2">&quot;vim&quot;</span>            <span class="c"># Shorten Vim</span>
</span><span class='line'><span class="nb">alias </span><span class="nv">vrc</span><span class="o">=</span><span class="s2">&quot;vim ~/.vimrc&quot;</span> <span class="c"># Edit my Vimrc</span>
</span><span class='line'><span class="nb">alias </span><span class="nv">vc</span><span class="o">=</span><span class="s2">&quot;vim .&quot;</span>         <span class="c"># Open the current directory in Vim</span>
</span><span class='line'>
</span><span class='line'><span class="nb">alias </span><span class="nv">vzpf</span><span class="o">=</span><span class="s2">&quot;vim ~/.zprofile&quot;</span>  <span class="c"># Edit my zprofile</span>
</span><span class='line'><span class="nb">alias </span><span class="nv">zsrc</span><span class="o">=</span><span class="s2">&quot;source ~/.zprofile&quot;</span> <span class="c"># Reload my zprofile</span>
</span></code></pre></td></tr></table></div></figure>


<p>These are, of course, pulled from my <a href="https://github.com/baweaver/special-sauce">Special Sauce
Repository</a>.</p>

<p>So what rule of thumb do I use when adding new aliases? If it takes more
than five keystrokes to do, I alias it. Digging into my .zprofile will
show you a most_used command, which I keep around to stay honest about how
often I actually use each command.</p>
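<p>The real most_used lives in that .zprofile and isn&rsquo;t reproduced here. As a rough sketch of the idea only, written in Ruby rather than shell, the method below tallies the first word of each history line. The method name, its arguments, and the sample history are all assumptions made for illustration:</p>

```ruby
# Hypothetical helper: count how often each command starts a history line
# and return the most frequent ones as [command, count] pairs.
def most_used(history_lines, limit = 3)
  counts = Hash.new(0)
  history_lines.each do |line|
    command = line.split.first
    counts[command] += 1 if command # skip blank lines
  end
  # Highest count first; ties are broken alphabetically.
  counts.sort_by { |command, count| [-count, command] }.first(limit)
end

# Made-up history, standing in for the lines of a real history file.
sample_history = [
  "git status", "vim ~/.vimrc", "git commit -m wip",
  "ls -la", "git push", "vim ."
]

top = most_used(sample_history, 2)
# top is [["git", 3], ["vim", 2]]
```

<p>Anything that shows up at the top of a list like this and takes more than five keystrokes is a good candidate for an alias.</p>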

<p>The amount of time I save from just that adds up quickly as I type many
of those commands several hundred times a day. Adding an alias, and
sourcing my .zprofile takes me all of five seconds to do.</p>

<h2>It adds up</h2>

<p>So what, we have a few niceties and aliases around. We may save five
minutes a day with a basic set. Perhaps, but the more you alias and the
more you start to chip away at your daily repetition, the more you will
realize that you&rsquo;re quickly outpacing your normal speeds.</p>
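<p>To put a number on &ldquo;five minutes a day&rdquo;, here is a back-of-the-envelope sketch. The helper name and the choice of 365 or 250 days a year are assumptions, so plug in your own figures:</p>

```ruby
# Hypothetical helper: hours saved per year from a daily saving in minutes.
def hours_saved_per_year(minutes_per_day, days_per_year = 365)
  (minutes_per_day * days_per_year) / 60.0
end

every_day = hours_saved_per_year(5)           # about 30.4 hours
working_days = hours_saved_per_year(5, 250)   # about 20.8 hours
```

<p>Even on the conservative estimate, that is roughly two and a half working days a year from a handful of aliases.</p>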

<h2>Be Lazy</h2>

<p>Really. Be lazy. Hate to repeat yourself so much that adding an alias is
a natural twitch. Hate doing things by hand so much that you crack open
your editor and start scripting it out!</p>

<p>Learn the keyboard shortcuts, and stop touching that mouse. If you&rsquo;re
really hardcore about it, learn Vim or Emacs and go to town on macros and
keybindings.</p>

<h2>Automate it</h2>

<p>When in doubt, automate it and document it. Sharing is caring, and many
people are kind enough to <a href="https://github.com/search?q=zprofile&amp;ref=cmdform&amp;type=Code">post their zprofiles</a>,
so take a peek and learn.</p>
]]></content>
  </entry>
  
</feed>
