Merging Dictionaries in a List Comprehension

I was reading a blog article by a fellow Python developer, who wants to update all dictionaries inside a list. In older versions of Python there is no trivial way to solve this problem. His solution is pretty interesting, as it combines list comprehensions, dictionaries and lambda functions. I tried to solve the same problem using recent additions to the Python language to get a cleaner result.

Be sure to read the article at estebansastre.com

Merging dictionaries

As noted in the referenced article, there isn't a standard way in Python to return an updated dictionary. This is because dictionaries are mutable: methods such as update() change the dictionary in place instead of returning a new one. When applying functional programming concepts to them, the result might not always be as expected.

See the Python code below.

a = {'a key': 'a value'}
b = {'b key': 'b value'}
print(a.update(b))  # prints None, because update() returns None

When reading the code, one would expect it to print {'a key': 'a value', 'b key': 'b value'}. Why does it behave this way? A dictionary is mutable, so the update() method modifies the existing dictionary in place. It returns None because, by convention, Python methods that mutate an object in place do not return that object.

The correct solution for this problem would be something like this:

a = {'a key': 'a value'}
b = {'b key': 'b value'}
a.update(b)  # returns None; the dictionary a is updated in place
print(a)  # {'a key': 'a value', 'b key': 'b value'}
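
If the original dictionary should stay untouched, the same idea can be wrapped in a small function that copies first and then updates the copy. This is only a minimal sketch; the name updated is a hypothetical helper, not something from the referenced article or the standard library.

```python
def updated(base, extra):
    """Hypothetical helper: return a merged copy and leave base untouched."""
    result = base.copy()   # shallow copy of the first dictionary
    result.update(extra)   # mutates only the copy; update() still returns None
    return result


a = {'a key': 'a value'}
b = {'b key': 'b value'}
merged = updated(a, b)
print(merged == {'a key': 'a value', 'b key': 'b value'})  # True
print(a)  # {'a key': 'a value'}
```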

Python 3.5 introduced unpacking generalizations as defined in PEP 448. When applied to the previous example, the result is something like this:

a = {'a key': 'a value'}
b = {'b key': 'b value'}
print(dict(**a, **b))  # {'a key': 'a value', 'b key': 'b value'}

See how easy that was!
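
One thing to watch out for, shown here as a small sketch with made-up keys: dict(**a, **b) passes the keys as keyword arguments, so it raises a TypeError when both dictionaries contain the same key. The dictionary display form {**a, **b}, also part of PEP 448, lets the right-most value win instead.

```python
a = {'key': 'from a', 'only a': 1}
b = {'key': 'from b', 'only b': 2}

# Dictionary display: later unpackings override earlier ones.
merged = {**a, **b}
print(merged['key'])  # from b

# Keyword-argument form: a duplicate key is an error.
raised = False
try:
    dict(**a, **b)
except TypeError as exc:
    raised = True
    print('dict(**a, **b) failed:', exc)
```

The keyword-argument form also requires every key to be a string, while the display form accepts any hashable key.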

Applying unpacking generalizations

I figured out that unpacking generalizations can also be used to update a list of dictionaries inside a list comprehension. Let's rewrite the first algorithm:

[dict(**item, **ext) for item in original]

And apply this algorithm to the same examples:

original = [
    {'title': 'Oh boy'},
    {'title': 'Oh girl'}]

ext = {'meta': {'author': 'Myself'}}

print([dict(**item, **ext) for item in original])

# [{'title': 'Oh boy',
#   'meta': {'author': 'Myself'}},
#  {'title': 'Oh girl',
#   'meta': {'author': 'Myself'}}]

What's the catch? This example requires Python 3.5 or higher, the release that introduced PEP 448. I've tested this code in Python 3.6. Yet another good reason to move away from legacy Python 2 to a recent Python 3 release.
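
A second, smaller catch is that the merge is shallow. The sketch below reuses the example data from above and shows that every resulting dictionary shares the same nested 'meta' dictionary; using copy.deepcopy per item is one possible way around that, not something the referenced article does.

```python
import copy

original = [{'title': 'Oh boy'}, {'title': 'Oh girl'}]
ext = {'meta': {'author': 'Myself'}}

# Shallow merge: every result points at the one nested dict inside ext.
shallow = [dict(**item, **ext) for item in original]
print(shallow[0]['meta'] is shallow[1]['meta'])  # True

# Deep-copying ext for each item gives every result its own nested dict.
independent = [dict(**item, **copy.deepcopy(ext)) for item in original]
independent[0]['meta']['author'] = 'Someone else'
print(independent[1]['meta']['author'])  # Myself
print(ext['meta']['author'])  # Myself
```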

Full code

The full code below requires Python 3.6 or higher. It can be run locally and tweaked as needed.

import unittest
import copy
from typing import List

ListOfDicts = List[dict]


class TestUpdatingListOfDicts(unittest.TestCase):
    original = [
        {'title': 'Oh boy'},
        {'title': 'Oh girl'}]

    ext1 = {'meta': {'author': 'Myself'}}
    ext2 = dict()

    result1 = [{
        'title': 'Oh boy',
        'meta': {'author': 'Myself'}},
        {'title': 'Oh girl',
         'meta': {'author': 'Myself'}}]
    result2 = copy.deepcopy(original)

    def esteban(self, original: ListOfDicts, ext: dict) -> ListOfDicts:
        """
        :author Esteban Sastre
        """
        return [(lambda x, y=item.copy(): (y.update(x), y))(ext)[1] for item in original]

    def yennick(self, original: ListOfDicts, ext: dict) -> ListOfDicts:
        """
        :author Yennick Schepers
        """
        return [dict(**item, **ext) for item in original]

    def test_result1_esteban(self):
        result = self.esteban(original=self.original, ext=self.ext1)
        self.assertEqual(self.result1, result)

    def test_result1_yennick(self):
        result = self.yennick(original=self.original, ext=self.ext1)
        self.assertEqual(self.result1, result)

    def test_result2_esteban(self):
        result = self.esteban(original=self.original, ext=self.ext2)
        self.assertEqual(self.result2, result)

    def test_result2_yennick(self):
        result = self.yennick(original=self.original, ext=self.ext2)
        self.assertEqual(self.result2, result)


if __name__ == '__main__':
    unittest.main()

Moving to Netlify

I threw out the S3 bucket and replaced it with Netlify, and the experience is just awesome!

An S3 bucket is a very useful building block, and I believe Amazon Web Services has the right building blocks for a very complex IT system. However, an S3 bucket alone is not a recommended method of hosting a website. You need additional tools such as Amazon CloudFront, AWS Certificate Manager, Amazon Route 53, ... and maybe even AWS CloudFormation!

How it works

The essentials for creating, updating and publishing a static website can be done from the command line using the netlify-cli Node.js package.

Only a few steps are required:

  • Create a website using netlify create
  • Deploy it using netlify deploy
  • Update the CNAME DNS record for www to point to www.netlify.com
  • Optional but recommended: redirect HTTP requests for the root DNS record to the www DNS record
  • Optional but recommended: enable a free SSL/TLS certificate from Let's Encrypt

I think that Netlify's internal CDN uses a push (upload) method, where each deploy populates every POP. I like this because it gives very fast feedback on whether an update succeeded or not!

CDN

Netlify has built a CDN that operates around the globe.

A quick check using CEKDNS:

  • Virginia, United States - Amazon
  • New York, United States - Amazon
  • Toronto, Ontario, Canada - Digital Ocean
  • San Antonio, Texas, United States - Rackspace
  • Sao Paulo, Brazil - Amazon
  • Frankfurt, Germany - Amazon
  • Dublin, Ireland - Amazon
  • Mumbai, India - Amazon
  • Tokyo, Japan - Amazon
  • Singapore - Amazon
  • Sydney, Australia - Amazon

Almost every continent appears in this list, which should ensure low latency from almost every country in the world! It is also worth noting that the architecture looks cloud-provider agnostic, as the CDN spans Amazon, Digital Ocean and Rackspace. This keeps the option open to switch easily from one to another and, if the momentum is big enough, even to install their own gear!

Edit 2018-01-22: CEKDNS seems to have shut down, removed dead link.

Hello Nikola!

So, finally! I have switched to a static site generator called Nikola. If you are familiar with the Python ecosystem, you might have already heard of this project, which has more than 1000 stars on GitHub.

First choice

Nikola was however not my first choice. I actually started with Pelican.

Too bad I had some issues with it:

  • Pelican 3.7.0 has issues with the localisation in the Quickstart script.

  • Pelican prefers to use a Makefile, which Windows does not support out of the box. I discovered a blog that offers .bat alternatives.

  • Pelican's Makefile uses the s3cmd command line tool. It is still being developed, but I prefer the AWS CLI. This can be changed inside the Makefile:

    s3_upload: publish
        aws s3 sync --delete output/ s3://${S3_BUCKET}
    
  • The AGPL license creates a lot of legal uncertainties around 3rd-party files and compiled content.

Nikola

I went with the latest version of Nikola and I used the Zen theme. The Zen theme uses LESS, so a LESS compiler is required.

This theme uses a different structure for the navigation links. Compared to the original theme, an icon class has been added to each tuple.

NAVIGATION_LINKS = {
    DEFAULT_LANG: (
        ('/index.html', 'Home', 'icon-home'),
        ('/archive.html', 'Archives', 'icon-folder-open-alt'),
        ('/rss.xml', 'RSS', 'icon-rss'),
    )
}

Hosting it

Nikola has an integrated system for deployments. I just tuned it for S3 bucket synchronisation, et voilà!

S3_BUCKET = 'xxxxxxxxxx'

DEPLOY_COMMANDS = {
    'default': [
        "aws s3 sync --delete output/ s3://{S3_BUCKET}".format(S3_BUCKET=S3_BUCKET),
    ]
}

There are some other important configuration details to keep in mind, because the defaults are not correct when deploying a static website with CloudFront. For example, requesting http://www.example.com/ usually returns the content of http://www.example.com/index.html. This behaviour is defined in the web server configuration and is often enabled by default. In CloudFront, it needs to be defined explicitly by setting DefaultRootObject: index.html in the CloudFront distribution configuration, and it only works for the root path, not for subsequent paths. By default, Nikola tries to generate nice URLs, which expect index.html to be the default object for every path. Because CloudFront does not support this, Nikola needs to be told so by setting STRIP_INDEXES = False.
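
As a rough sketch of what that setting changes: the conf.py line below is the one named above, while link_for is a hypothetical helper (not part of Nikola) that only illustrates the two link styles and why the stripped form has no matching object behind CloudFront.

```python
# conf.py fragment: keep "index.html" in generated links.
STRIP_INDEXES = False


def link_for(page_path: str, strip_indexes: bool) -> str:
    """Hypothetical illustration of the two link styles, not Nikola's code."""
    if strip_indexes and page_path.endswith("/index.html"):
        # Nice URL: relies on the server mapping the directory to index.html.
        return page_path[: -len("index.html")]
    # Explicit URL: names the object exactly as it is stored in the bucket.
    return page_path


print(link_for("/posts/hello/index.html", strip_indexes=True))   # /posts/hello/
print(link_for("/posts/hello/index.html", strip_indexes=False))  # /posts/hello/index.html
```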

Creating an Amazon CloudFront distribution is fairly simple: just select the correct S3 bucket, set the Default Root Object to index.html and you're good to go.

Or, if you want to do something fancy like I did, use an AWS CloudFormation YAML template:

AWSTemplateFormatVersion: '2010-09-09'
Resources:
  blogDistribution:
    Type: AWS::CloudFront::Distribution
    Properties:
      DistributionConfig:
        Origins:
        - DomainName: bucket.s3.amazonaws.com
          Id: S3-bucket
          S3OriginConfig:
            OriginAccessIdentity: origin-access-identity/cloudfront/XXXXXXXXXXXXXX
        Enabled: 'true'
        Comment: 'bucket using CloudFormation'
        HttpVersion: http2
        DefaultRootObject: index.html
        Aliases:
        - xxxxxxx.com
        DefaultCacheBehavior:
          AllowedMethods:
          - GET
          - HEAD
          TargetOriginId: S3-bucket
          ForwardedValues:
            QueryString: 'false'
            Cookies:
              Forward: none
          ViewerProtocolPolicy: allow-all
        PriceClass: PriceClass_100
        ViewerCertificate:
          CloudFrontDefaultCertificate: 'true'

The CloudFormation User Guide contains more template snippets for CloudFront.

The only feature that seems to be missing is IPv6 support in CloudFront when using CloudFormation.

Latency

There are a lot of online tools that measure the latency to a website. The issue with these tools is that they often run in a datacenter rather than on an end-user ISP connection. All tests below were done using IPv4 ping or traceroute utilities from desktops and smartphones. The results should be similar for other websites on CloudFront. YMMV.

ISP                Technology  Country  AWS edge location  Latency (ms)
EDPnet             VDSL2       BE       Amsterdam (AMS1)   25
Luxembourg Online  ADSL2+      LU       Frankfurt (FRA6)   33
Orange             LTE         LU       Amsterdam (AMS1)   30
Post               Fiber       LU       Frankfurt (FRA5)   6
Post               VDSL2       LU       Frankfurt (FRA5)   31
Proximus           VDSL2       BE       London (LHR50)     30
Tango              LTE         LU       Frankfurt (FRA50)  41
Telenet            HSPA+       BE       Amsterdam (AMS50)  343

Getting started with SDR using an RTL2832U

I purchased an ezcap USB 2.0 DVB-T stick a while ago. What's so special about a DVB-T receiver, you say? Well, the device contains a Realtek RTL2832U, which can be used as a Software Defined Radio (SDR): a radio interface that can be controlled using software. The drawback is that it is only a receiver, and for some of you the radio spectrum it covers might be too limited. But hey, it's cheap!

Requirements

Install the following dependencies:

$ sudo apt-get install libusb-1.0-0-dev git cmake

⇒ Detailed installation instructions can be found on Michael Hirsch's blog <http://blogs.bu.edu/mhirsch/2012/07/getting-started-with-rtl2832-ezcap-usb-sdr-receiver-in-ubuntu/>.

However, when I was testing the SDR USB stick I came across an error:

$ sudo ./rtl_test -t

Found 1 device(s):
0:  Generic, RTL2832U, SN: 77771111153705700

Using device 0: Generic RTL2832U

Kernel driver is active, or device is claimed by second instance of librtlsdr.
In the first case, please either detach or blacklist the kernel module
(dvb_usb_rtl28xxu), or enable automatic detaching at compile time.

usb_claim_interface error -6
Failed to open rtlsdr device #0.

This can be solved by unloading the kernel module:

$ sudo modprobe -r dvb_usb_rtl28xxu

Now it should output something like this:

$ sudo ./rtl_test -t

Found 1 device(s):
0:  Generic, RTL2832U, SN: 77771111153705700

Using device 0: Generic RTL2832U
Found Fitipower FC0013 tuner
Supported gain values (23): -9.9 -7.3 -6.5 -6.3 -6.0 -5.8 -5.4 5.8 6.1 6.3 6.5 6.7 6.8 7.0 7.1 17.9 18.1 18.2 18.4 18.6 18.8 19.1 19.7
Sampling at 2048000 S/s.
No E4000 tuner found, aborting.

Use Gqrx to get a nice graphical interface overview of the signals received:

$ sudo apt-get install gqrx-sdr

$ sudo gqrx

Now you can tune to a local FM radio station, for example 101.4 MHz. Spend some time getting used to it. Later I will try to do some more hacks with this, like processing the raw samples and applying some digital signal processing techniques to them. When you're tired of hacking with it, you can also use it for its original purpose: watching TV!