Data Pointed Visualization, Statistics, and Art Fri, 07 Dec 2018 16:51:23 +0000

The Skies Are Not Cloudy All Day Tue, 04 Dec 2018 17:48:51 +0000 Stephen Von Worley

Editor’s Note: For maximum understandability, please first read this article’s contextual prerequisite: Where The Buffalo Roamed.

When I was a kid, McDonald’s was a monolith: immutable and, barring an act of God, indestructible. After St. Peter let you pass through the Pearly Gates, he’d bop down to the local Micky Dee’s, mow through the super-sized #7 meal, and grab a soft-serve vanilla cone for the road.

Now, if McDonald’s truly did last forever, and every year, corporate opened X% more? Why, that’s the recipe for Exponential McGrowth! From an overlook above town, buddies, beer, and I pondered… Would it end in the salty pleasures of McNirvana? Or a fast food adaptation of the gray goo, featuring a countryside gradually swamped by a layer of cast-off straws, fumbled McNuggets, and similarly durable pieces parts?

“lol dudes,” dad me interjects, “but McDonald’s do shut down, and more often than you might expect!” Witness this message, emailed to Yours Truly by a McInterested Party:

We are in Round Mountain Nevada. There was a McDonalds in Tonopah Nevada 45 miles away. Since it closed we now have to drive 158 miles to Battle Mountain to the nearest McDonalds. [...] I don’t think we are the furthest point but it must be close.

This tip begged for investigation, and now, as a byproduct of Thanksgiving slack…

Behold the interactive 2018 update to the Contiguous United States as Visualized by Distance to the Nearest McDonald’s:

Distance To McDonald's

Click above to launch a zoomable map of the domestic McField.

For the map connoisseurs: the color of each pixel represents the geodetic distance to the nearest U.S. McDonald’s, located by a recent scrape, rendered as Web Mercator tiles using Proximatic, and warped to an EPSG:2163 National Atlas Equal Area projection via OpenLayers and proj4 plus some rotation code to keep north pointing more-or-less up.
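For the curious, the pixel-to-coordinate step behind a Web Mercator tile can be sketched in a few lines of Python. This is a minimal illustration using the standard slippy-map tile formulas, not Proximatic's actual code; the function names and the 256-pixel tile size are assumptions made for the example.

```python
import math

TILE_SIZE = 256  # assumed tile edge in pixels; a common default, not confirmed by the article


def tile_to_lonlat(zoom, x, y):
    """Convert fractional Web Mercator tile coordinates to (lon, lat) in degrees.

    x and y are measured in tiles from the top-left corner of the world map,
    so x = 0.5, y = 0.5 at zoom 0 is the centre of the single world tile.
    """
    n = 2 ** zoom
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lon, lat


def pixel_center_lonlat(zoom, tile_x, tile_y, px, py):
    """Longitude/latitude at the centre of pixel (px, py) inside one tile."""
    return tile_to_lonlat(
        zoom,
        tile_x + (px + 0.5) / TILE_SIZE,
        tile_y + (py + 0.5) / TILE_SIZE,
    )
```

Each pixel's longitude and latitude would then be handed to a nearest-restaurant search, and the resulting distance mapped to a color.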

In our map, scattered McDropouts pepper the rural West. However, the McNetwork features high availability, and for almost every town that lost its Micky Dee’s, like John Day, Oregon or Mammoth Lakes, California, there’s a functional backup McTransmitter within an hour.

The outlier? Tonopah, Nevada, better suited to metaphors with nubby tires, wherein an irresistible McHankerin’ helps you into his rig, wheels you all the way Up S*** Creek, then busts a McAxle.

You see, Tonopah ranked amongst the most isolated McOutposts, and as its fryer blooped a final bubble, the surrounding McFrontier sprang forth and multiplied.

At its center, by my calculations, forty minutes of washboard south from the Extraterrestrial Highway and a few klicks to the civilian side of the greater Area 51 perimeter fence, you’ll find the Lower-48’s currently McFarthest Spot: a sandy swatch of Silver State sagebrush, just over 120 miles, as the crow flies, from the nearest McDonald’s!

Upon this revelation, my first order of business was to retrieve the McFarthest geocache, which, now beknownst to me, sat in a stale location. Had it survived eight years in the high desert, and, if so, did anyone ever visit? Vegas noted the possibility of Brownian Burners and set the latter odds at 50/50.

The cache and its contents – a space blanket, Happy Meal toy, and casino chip – were intact and undisturbed. In fact, but for my footprints and a thin layer of UV haze on the Tupperware, everything looked the same. Way out there, time unwinds at geological speed.

Disclaimer: For the record, the prior two paragraphs are 100% fictional: because the Feds get all cold-prickly about people hiding-and-seeking things in designated wildlife refuges. And little old me? Break the law? Intentionally? Good golly, never ever, I swear.

Happily, McFarthest Spot 2018 is on BLM land, the type of government property traditionally marked on maps as “anything goes.” So, next spring, after the Sierra thaws, I shall hitch up the mules, cross Tioga Pass to yonder Nevada, and emplace McFarthest Geocache v2.0.

As do all journeys through the desert, it will come with a story, so… Stay tuned!

Proximatic Is For Sale Wed, 31 Oct 2018 04:41:36 +0000 Stephen Von Worley

The Northeast In Tweets

The “elevator pitch” version:
I’m selling all rights to Proximatic, my fast geospatial search system.
Proximatic is:
Fast: Remarkably high performance for many types of searches.
Accurate: Models the earth as an ellipsoid to minimize error.
Flexible: Supports a broad range of queries, from simple to complex.
Easy to use: Index and search data with fewer than ten lines of code.
Extensible: Clear, object-oriented design is easily enhanced and modified.
Applications include real-time analytics, ad targeting, content customization, data visualization, recommendation engines, real estate intelligence, routing, navigation, demographics, and more. Interested? Need more information? Contact me for a whitepaper!

Nine years ago, I published a little blurb about a magical place known as the McFarthest Spot, and thereafter, a slow drip of personal research on street grids, urban growth, and more. All involved the manipulation of data, mostly geographic, and, for each successive story, as a form of deliberate practice, I’d bite off slightly more than I’d previously been able to chew. Thousands of restaurant locations grew to millions of road segments to billions of Tweets, and by the end of it all, I’d authored a fairly respectable suite of geo-analytic software tools.

Around then, came a realization: out there, a universe of data described just about every spot in the world in remarkable detail. However, most of it was effectively unusable: by dint of obscurity, format, license restrictions, expense, or sheer size.

Motivated thusly, I would launch a geodata-as-a-service (GaaS) business: to unsilo all of that information, provide scalable access to it via Web API, and satisfy each query – how did people within 0.5km of (37.80766°N,122.26012°W) vote in the last Presidential election, or whatever – within a few milliseconds.

The key infrastructure? The search system, which would deftly pluck the necessary information from a sea of terabytes, many times each second, and accommodate a variety of searches, possibly on data sources that I hadn’t even discovered yet. The off-the-shelf options? Somewhat slow, too stiff, or simply inscrutable. To meet the design goals, I would have to engineer it myself.

So, I dropped my existing geo tools into a pot, stewed them for a few years in a rich broth of research, development, and deep thought, and voilà, Proximatic: my high-performance geospatial indexing and search system.

Proximatic combines a proprietary ellipsoidal geographic distance algorithm (about 6x faster and much more accurate than a typical spherical Haversine implementation), a heavily-optimized core search engine, and some novel optimizations on a recursive space-partitioning scheme to index and search large amounts of spatial data, in-memory, at blistering speed. An Amazon EC2 m5.2xlarge instance can run nearest neighbor searches on 1,000,000 randomly-distributed neighborhood-scale geocircles/georectangles in parallel at a combined rate of over 1,000,000 queries per second.
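To make the spherical-versus-ellipsoidal distinction concrete, here is a minimal Python sketch that compares a textbook Haversine distance with Lambert's formula, one well-known ellipsoidal approximation. It is emphatically not Proximatic's proprietary algorithm, which the article does not describe; the function names are assumptions, and the constants are the usual WGS84 values.

```python
import math

WGS84_A = 6378137.0            # WGS84 equatorial radius in metres
WGS84_F = 1 / 298.257223563    # WGS84 flattening
MEAN_RADIUS = 6371008.8        # mean Earth radius used for the spherical model


def central_angle(lat1, lon1, lat2, lon2):
    """Haversine central angle in radians; inputs are in radians."""
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(h)))


def haversine_m(lat1, lon1, lat2, lon2):
    """Spherical distance in metres; inputs are in degrees."""
    return MEAN_RADIUS * central_angle(*map(math.radians, (lat1, lon1, lat2, lon2)))


def lambert_m(lat1, lon1, lat2, lon2):
    """Ellipsoidal distance in metres via Lambert's formula; inputs in degrees."""
    # Reduced latitudes project each point onto the auxiliary sphere.
    b1 = math.atan((1 - WGS84_F) * math.tan(math.radians(lat1)))
    b2 = math.atan((1 - WGS84_F) * math.tan(math.radians(lat2)))
    sigma = central_angle(b1, math.radians(lon1), b2, math.radians(lon2))
    if sigma == 0.0:
        return 0.0
    p, q = (b1 + b2) / 2, (b2 - b1) / 2
    x = (sigma - math.sin(sigma)) * (math.sin(p) * math.cos(q) / math.cos(sigma / 2)) ** 2
    y = (sigma + math.sin(sigma)) * (math.cos(p) * math.sin(q) / math.sin(sigma / 2)) ** 2
    return WGS84_A * (sigma - WGS84_F / 2 * (x + y))
```

For one degree of longitude along the equator, the two models disagree by roughly 124 meters out of about 111 kilometers, which hints at why a spherical shortcut adds error to fine-grained searches.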

From top to bottom, Proximatic is designed with flexibility and ease-of-use in mind. Work with shapes – points, lines, geocircles, georectangles, polylines, and polygons – directly, or specify a “shaper” function to easily index any type of data element without modifying its class. Run nearest, farthest, “less than distance”, “within range”, “contains” and “contained-by” searches, or assemble them into complex boolean queries with optional non-spatial filters. Leverage the “ruler” abstraction to define/order searches on the average/minimum/maximum of multiple distances, tweak a distance measurement to include non-spatial criteria (like customer ratings), or adapt the search to tolerate a certain amount of movement within the index. And more! All in a lightweight Java 8+ compatible library with a generic, object-oriented core that can be adapted to search other spaces.

Ok, super-duper… right? Well, sure, except that things change, and, for the foreseeable future, the financial risk of bootstrapping a startup probably isn’t the best choice for my family. So…

Proximatic is for sale, lock, stock, and barrel!

The buyer gets all rights to the technology: the source code, algorithms, and inventions embodied within, plus the option to develop any of the three provisional patent applications I’ve filed, or let them expire unpublished and keep Proximatic a trade secret.

Use Proximatic to improve the performance of your servers. Analyze data real-time to make decisions or target content. Optimize the embedded navigation system in your self-driving car. Feed your machine learning stack. Enhance an existing product. Or create something new.

Whatever the plan, I’m happy to assist: hire me as part of the purchase to help make it happen.

Interested? Please email me and I’ll send you a whitepaper with more details.

Crayon The Grids Wed, 15 Oct 2014 04:24:39 +0000 Stephen Von Worley

“Here!” exclaimed Jebediah as he nosed his schooner onto a fan of fertile loam. Come sundown, a makeshift corral encircled his livestock, and by Sabbath eve, the crown of a crude barn rose above the neighboring hummock.

Next spring, a steady procession of ships yielded a healthy crop of farmhouses. Wagon wheels burned a double track to the river landing, where itinerant capitalists soon repurposed a cluster of spartan shacks:

Dispell ill humours at Rodger’s Saloon!
Satisfy your homestead needs with Trusty Mercantile!
Every fifth horseshoe free at The Irony!

Forthwith straightened and graded, Main Street ran east to west, land astride platted into tidy rectangles. Soon, Washington and Jefferson joined in parallel, crossed at even intervals by perpendicular First, Second, and Third Streets.

A crystal in saturated solution, this grid grew: shooting southeast into open country along Telegraph Road, doglegging left around Miller’s Swamp, and crossing the river at Monroe Street Bridge, which lensed the opposite shore into a different orientation…

And so on, until some time ’round the Depression, when town planners discovered:

Oh my golly, curves! By George, a city block doesn’t necessarily need to be a rectangle, right? And, three-way intersections, yeah, they’re pretty darn tootin’ okay…

Thereafter, new streets came, but in more pear-shaped and less grid-like arrangements than before.

Now, to Yours Truly, nirvana is a sunny day, strolling well-worn sidewalks past the wide-windowed storefronts of an old downtown. Some people might call me a Main Streetaholic – I’ve been known to scour maps for quaintness, and on a road trip, I’ll happily choose the Byzantine route just to experience the charms of a bygone Broadway.

And I thought I knew about every one between Ukiah and Scotts Valley.

Until, out of the blue, a friend informed me: “I’m moving to Graton!”

Graton…? California? Uh… Why? Up came the Street View, and there, west of Santa Rosa, it was: a pocket-sized downtown decorated by a handful of adorable “Old West”-style buildings. OMFG. What other treasures had I missed?!

I made these maps to help me find out.

Above is San Francisco, and below, New York, Washington DC, Los Angeles, Tokyo, and five other interesting metros:

New York
Los Angeles
Washington DC

That’s every public street, colored by the predominant orientation of itself and its neighbors, thickened where the layout is most “grid-like” – to use an old-school woodworking metaphor, it’s as if we brushed some digital lacquer over the raw geographic transportation network data to make the grain pop.

For the detail-oriented, these are 100%-algorithmic images generated from MapZen’s Migurski-inspired October 2014 OpenStreetMap Metro Extracts as follows. First, we assign each linear street segment a compass-heading-based tone from a modified sinebow, where a 90 degree directional difference corresponds to a full color revolution, so that roads at right angles to each other have the same hue. Then, to render each point on the map, we use Proximatic, my custom high-performance k-NN engine, to calculate the length-weighted average of the colors assigned to the nearest 500 meters of street, keying render weight to the local degree of parallelism/orthogonality (derived in a similar mod-90° vector space), with rolloffs for outlying roads and territory.
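The coloring idea can be sketched roughly as follows. This Python snippet uses the plain sinebow rather than the "modified" one mentioned above, whose details aren't given, and it skips the parallelism weighting and rolloffs entirely; the function names are assumptions for illustration.

```python
import math


def sinebow(t):
    """Plain sinebow: maps t in [0, 1) to an (r, g, b) triple in [0, 1], cyclic in t."""
    t = 0.5 - t
    return tuple(math.sin(math.pi * (t + k / 3)) ** 2 for k in range(3))


def heading_color(heading_deg):
    """One full color revolution per 90 degrees, so perpendicular streets match."""
    return sinebow((heading_deg % 90) / 90)


def blended_color(segments):
    """Length-weighted average color of (heading_deg, length) street segments."""
    total = sum(length for _, length in segments)
    if total <= 0:
        raise ValueError("segments must have positive total length")
    sums = [0.0, 0.0, 0.0]
    for heading, length in segments:
        for i, channel in enumerate(heading_color(heading)):
            sums[i] += channel * length
    return tuple(s / total for s in sums)
```

Because the heading is taken modulo 90°, a north-south avenue and the east-west streets crossing it receive the same hue, so a clean grid blends to one saturated color while a tangle of headings averages toward gray.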

Pan and zoom via Vladimir Agafonkin’s excellent Leaflet viewer, and click the “Acme” button for a more conventional map of the current view, kudos to Poskanzer.

Lots of stories in there: of cities waxed, towns waned, territory absorbed, and terrain negotiated (or, à la San Francisco, ignored completely).

Enjoy, and I’ll see you in the grids!

Tweet-Bokeh-O-Rama Thu, 09 May 2013 18:10:20 +0000 Stephen Von Worley

Transmissions from a new project on the data frontier:

2013-04-29 23:11Z:

Three days wandering landscape of one-billion-plus geolocated Tweets. Overstimulated. Devising computer-aided system to generate density fields. Will calm noise, help to answer important questions:

How does global conversation vary over time and space?
Windiest place – spot with most characters per Tweet – in world?
“Hella” tweeted more from downtown San Francisco or Telegraph Avenue in Berkeley?

2013-05-04 03:27Z:

Density estimation apparatus under development.

Inverse-squared approach consumes all available resources. Still running.
Kernel density looks blobby, oversmeared, gappy. Feels arbitrary, artificial.
Binning replaces fine data detail with cell structure. Like feeding Tweets through woodchipper, slopping pulp into buckets.

Frustrated. Radioed home base. Response drowned out by Retweet static. Will rest here.

2013-05-06 12:08Z:

Carrier pigeon arrived. Note suggests k-nearest neighbors algorithm. Wired together prototype with k-d tree, priority queues. Bushwhacked to vicinity of TwitterPlex, wiggled “nearest neighbor” knob from one to 10000:

Peek a boo, San Francisco, ICU! (Alternate views: “photo booth”-style or high res @ 1, 10, 100, 1000, and 10000 neighbors.)
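Field note for fellow travelers: the k-nearest-neighbors density idea can be sketched in Python as below. This is a brute-force stand-in for the k-d tree prototype mentioned above, working on flat x/y coordinates; the function name and the "k divided by the area of the circle reaching the k-th neighbor" estimator are the usual textbook form, assumed here rather than taken from the actual apparatus.

```python
import heapq
import math


def knn_density(query, points, k):
    """Estimate point density at `query` as k / (area of circle to the k-th neighbor)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    qx, qy = query
    # heapq.nsmallest keeps only the k smallest distances, in ascending order.
    nearest = heapq.nsmallest(
        k, (math.hypot(px - qx, py - qy) for px, py in points)
    )
    radius = nearest[-1]  # distance to the k-th nearest point
    if radius == 0.0:
        return math.inf   # k points sit exactly on the query location
    return k / (math.pi * radius ** 2)
```

Small k follows every lump in the data, while large k smooths the field, which is the effect of wiggling the "nearest neighbor" knob.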

2013-05-08 04:20Z:

Refocused on New York, London, Los Angeles, Tokyo, Jakarta, Paris:

New York, New York
London, UK
Los Angeles, California
Tokyo, Japan
Jakarta, Indonesia
Paris, France

Conclusions: Twitter oatmeal is lumpy, people love to Tweet from airport. More as becomes apparent.

Trolls soon. Signing off for now, Steve.

Literally Billions Mon, 08 Apr 2013 17:12:01 +0000 Stephen Von Worley

Last year, according to the U.S. Census Bureau, the world’s population topped seven billion, and now stands at approximately 7,077,490,000 as of noon Eastern, April 8, 2013.

Now, google “7 billion” or “all people” and you’ll see some pretty things, like Fathom’s Dencity and bmander’s North American DotMap, bobbing in an emotionless sea of Excel charts, peppered by strange tidbit upon offbeat factoid, ad infinitum. For example, did you know that every human being on the Earth, packed tightly, would fit inside a 900-meter-diameter sphere? Me neither.
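The sphere factoid is easy to sanity-check. The quick Python sketch below divides the volume of a 900-meter-diameter sphere among the population figure quoted above; the function name is an assumption, and the only inputs are the two numbers already mentioned.

```python
import math


def litres_per_person(population, sphere_diameter_m):
    """Volume share per person, in litres, of a sphere with the given diameter."""
    radius = sphere_diameter_m / 2
    sphere_m3 = 4 / 3 * math.pi * radius ** 3
    return sphere_m3 / population * 1000  # 1 cubic metre = 1000 litres


share = litres_per_person(7_077_490_000, 900.0)
print(f"{share:.1f} litres per person")
```

That works out to roughly 54 liters apiece. Since a human body is about as dense as water, that implies an average of around 54 kilograms per person, which seems plausible for a population that includes children.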

However, had you wanted to experience the world’s population in its most primitive form – a single Web page featuring seven-billion-plus faces, dots, or whatever – as of yesterday, your search would’ve turned up bupkis.

A hole in the Internets! Which, upon discovery, as a capable netizen, I was obligated to fill.

Editor’s Note: I’ve since learned that the “7 Billion World” website has existed since October 2011. Check it out!

Click the circles below to view the full 17 football-fields-worth of fresh online real estate in browser-stretching 1200000p:

Some Of The Many Dots

That’s a dot for every man, woman, and child alive today, dynamically indexed to the nearest million per U.S. Census Bureau estimates. Arranged similarly, in real life, these seven gigapeople would cover the entire state of Rhode Island.

Within is lots of latent mischief, more than a few megatons of evil, and simultaneously, an even greater potential for creativity, kindness, and love.

Enjoy, and let’s be good to each other out there.

Petri Dish Wed, 30 Jan 2013 18:16:48 +0000 Stephen Von Worley

Now, for your amusement, a pure HTML5+Javascript interpretation of computer science classic The Game of Life:

Why? Well…

Sixty-some years ago, the first computer operators punched binary machine code into paper tape. Programming was hard.

Soon, machine code begat assembly language, assembly language begat ALGOL, ALGOL begat CPL, then BCPL, B, C, and finally, C++. Thus equipped, in a single day, the ambitious geek could achieve what would’ve taken years of diddling ones and zeros. Life was good.

Around that time, the Internet made its debut, and then, OMFG…

The World Wide Web! Click a link, and your browser fetched the HTML from some far-flung server and rendered it up all nice and pretty. Like magic! Except that after a page loaded, it didn’t do much, but maybe animate a GIF or blink a tag.

Why so static? Because the dominant C/C++ paradigm excelled at producing giant, star-shaped pegs to the Web’s round, dialup-sized holes. Sensing the need, Sun gave us Java, which promised safe, bite-sized applets that you’d Write Once and Run Anywhere.

In practice, everything didn’t run quite everywhere. However, by 1999, both major Web browsers shipped with a high-quality Java virtual machine which would execute integer code nearly as fast as the equivalent C. By happenstance, I’d just founded an agency with a crackerjack engineering team, and we used Java’s “good parts” to create the first rich media banner ads, video-game style, by smashing bits, precomputing tables, and unrolling loops in ways that would make the Google Doodle blush.

Then, Macromedia’s Flash burst on scene, hypnotized designers with its whizzy tweens, and killed Java on the client. Overnight, the “interactive” Web degenerated into a stew of gratuitous transitions, dirt-slow ActionScript, and strange bugs that somehow survived each new Player release. Times were dark.

Now, a decade later, hallelujah… for the HTML5+Javascript duo heralds the second coming of efficient, general-purpose Web programming!

Hold on, not so fast… because while the latest desktop incarnations of Chrome/Safari/Firefox have Javascript performance pretty much nailed, what about lesser-endowed iPhones, Android eReaders, Internet-enabled toasters, and everything else?

To find out, I whipped together this Life implementation, optimized it for speed, and grafted on some benchmarking code. After iterating at full tilt for about ten seconds, it beams back timing data and details about the environment it ran in. With enough results, I’ll be able to better understand the universe of Javascript performance and tell you all about it.
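For readers who'd like to see the moving parts, here is a minimal Python sketch of a Life generation step plus a crude timing loop. It is not the site's Javascript implementation: it uses an unbounded set of live cells instead of a fixed grid, and the function names and the generations-per-second metric are assumptions chosen for illustration.

```python
import time
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (x, y) cells that are alive."""
    # Count, for every cell adjacent to a live cell, how many live neighbors it has.
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


def benchmark(live, seconds=10.0):
    """Iterate at full tilt for about `seconds` and return generations per second."""
    generations = 0
    start = time.perf_counter()
    while time.perf_counter() - start < seconds:
        live = step(live)
        generations += 1
    return generations / (time.perf_counter() - start)
```

Calling `benchmark` on a starting pattern such as a glider yields a single speed figure that could be reported alongside details of the environment it ran in.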

So, please, drag your mouse | finger | sausage to become Conway, creator of tiny, cellular automata worlds. On as many different devices as you can manage.

And, of course, feel free to show it to your friends, so they can too.

Thanks, and ¡Viva la vida algorítmica!

Above Sea Level Wed, 14 Nov 2012 18:10:46 +0000 Stephen Von Worley

New York City, Elevation And Population Density, 2010

Hurricane Sandy stripped the political haze from one of climate change’s core truths: when the oceans rise, they’ll submerge land, inundate homes, and destroy lives.

How high will the seas swell? And how fast? No one knows for certain. Science blasted us to the moon and built the Internet, so I put my faith in the climatologists when they say “later this century” and “by a lot”. However, the Devil might advocate that we don’t completely understand today, much less how to model what’ll happen decades from now. And he’d be right…

Perhaps, come July 2100, your great-grandson or daughter will vacation in family bliss alongside the same ocean we’ve always known.

Or instead, the icecap, lubricated by a parade of balmy summers, will slide off Greenland, float into the Gulf Stream, and melt like so many snowballs. Total sea level increase: twenty-plus feet.

And every ton of blithely-belched CO2 tilts the odds a hair closer to this Doomsday future.

To help us understand what’s at risk, I’ve mapped the vulnerability of major U.S. cities by combining 2010 Census and USGS data to show where people live and the height of the land underneath them.

Above, we see New York City, with dense residential neighborhoods colored by elevation: white at the mean tide line, fading to yellow at 12 feet, orange at 25 feet, and blue at 50 feet or more above sea level, with less-populated areas represented by the same scheme, darkened proportionally.
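The color scheme just described can be approximated with a small Python sketch. The elevation stops come from the text above, but the exact RGB values, the linear interpolation between stops, and the way darkening scales with population density are all assumptions made for this example.

```python
# (elevation in feet, RGB color); the RGB values are illustrative guesses
STOPS = [
    (0.0, (255, 255, 255)),   # white at the mean tide line
    (12.0, (255, 255, 0)),    # yellow
    (25.0, (255, 165, 0)),    # orange
    (50.0, (0, 0, 255)),      # blue at 50 feet or more
]


def elevation_color(feet):
    """Linearly interpolate between the color stops, clamping at both ends."""
    if feet <= STOPS[0][0]:
        return STOPS[0][1]
    for (low, c0), (high, c1) in zip(STOPS, STOPS[1:]):
        if feet <= high:
            t = (feet - low) / (high - low)
            return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))
    return STOPS[-1][1]


def darken(color, density_fraction):
    """Scale a color toward black; 1.0 means dense residential, 0.0 means empty."""
    f = min(max(density_fraction, 0.0), 1.0)
    return tuple(round(channel * f) for channel in color)
```

A pixel's final color would then be something like `darken(elevation_color(feet), density_fraction)`, so sparsely populated lowlands show up as dim versions of the same warning hues.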

Below, nine more coastal metros, including Boston, Miami, Los Angeles, and San Francisco:

Miami, Florida
Tampa-St. Petersburg, Florida
Seattle, Washington
San Francisco-Oakland, California
Los Angeles, California
Boston, Massachusetts
San Diego, California
New Orleans, Louisiana
San Jose-Mountain View, California

Clearly, there’s significant growth potential in the hip-wader market.

Best of luck to anyone still bogged down in Sandy’s wake.

Dance, Factors, Dance Mon, 29 Oct 2012 07:21:40 +0000 Stephen Von Worley

Upon first sight of Brent Yorgey’s brilliant Factorization Diagrams, I knew that I should take his lovely little dots and make them dance.

Our first tango would be inspired by the digital clock, with a separate diagram for each of hours, minutes, and seconds. For example, we’d portray 4:34:27 am like so:

4:34:27 on the Factor Clock

One night, plans discreetly set, we stole away, rendezvoused, and had our fun prancing about. But alas, what the encounter had in grace, it begged for drama. Quite simply, there wasn’t enough panache – dynamic range – because while hours go up to 23, and minutes and seconds to 59 (*), the diagrams don’t really start cooking until they reach the triple digits. And, per the choreography, zero maps to a motionless blank space, or, if represented by an alternate step, a stray left foot in a square clog, staggering unnaturally amongst the positive, round-slippered wholes.

So, instead, we practiced the best parts of our “clock” routine – the rhythmic movements and snappy transitions – and distilled the remainder to its essence.

Factorization Diagram: 729

And thus was born the Factor Conga: a promenade of primes, composites, and their constituents, arranged with an aesthetically-tuned variation of Yorgey’s rules, one per second.
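For those who'd like to peek behind the curtain, the gist of a factorization diagram can be sketched in Python: factor the number, then nest rings of dots, one ring level per prime factor. This is a simplified take, not Yorgey's exact rules nor the aesthetically-tuned variation used here; the function names and the shrink factor for nested rings are assumptions.

```python
import math


def prime_factors(n):
    """Prime factors of n in ascending order, by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def layout(factors, cx=0.0, cy=0.0, radius=1.0):
    """Dot positions for a diagram: the largest factor forms the outermost ring."""
    if not factors:
        return [(cx, cy)]
    *rest, p = factors
    # Shrink each nested group so neighboring groups on the ring have room (assumed ratio).
    s = math.sin(math.pi / p)
    child_radius = radius * s / (1 + s)
    dots = []
    for i in range(p):
        angle = 2 * math.pi * i / p + math.pi / 2
        dots.extend(layout(rest,
                           cx + radius * math.cos(angle),
                           cy + radius * math.sin(angle),
                           child_radius))
    return dots
```

For example, `prime_factors(729)` returns six threes, so the diagram for 729 is triangles of triangles, six levels deep, and `layout` produces one dot per unit.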

Please play with the interactive, and do give it a minute to hit its stride. Or whirl there with a tap or two of the fast forward button.

This is what the dance clubs look like on Alpha Centauri.

For various reasons, the original animation topped out at 10,000. Use the unlimited version to enter the device-wedging, battery-burning beyond!

His And Hers Redux Wed, 19 Sep 2012 17:23:41 +0000 Stephen Von Worley

For hire, instruct me to throw caution to the wind, and I’ll gleefully tap dance across the bleeding-edge Web for as long as you like.

Here, however, color me conservative. Sure, the haves drive shiny, freshly-updated, triple-headed, dodeca-core Mac Pros that can handle whatever I throw at them. But I also get lots of have nots, like Ma and Pa, who use the same ancient version of Windows SimpleBrowser their kid installed five years ago. For their sake, I’ve gotta step lightly with my tech. Because regaling them with a gray rectangle ain’t gonna cut it.

On the flip-side, a neckbeard visited this site a few seconds ago, and now curses into the green glow of his dialup-Lynx-VT100 rig:

No animated ASCII? WTF?

Well, sorry, dude, I’m here to help. But, to paraphrase Lincoln, I can only please X fraction of the people Y fraction of the time, where X = 1 and .99 > Y > .8, approximately, or vice versa.

Ergo, client-side, the 90% Rule:

If a new technology makes the experience better for nine tenths of the audience, use it.

Why do I tell you this? Because, after years of waiting patiently, at approximately 2:47pm on Thursday, September 13, 2012, Data Pointed’s user share of the last non-“modern” browsers – those which don’t support some semblance of HTML5, like Internet Explorer 8 – dwindled below the magic 10% threshold.

In other words, Inline SVG and Canvas, come to papa!

His And Hers Colors

To celebrate, I’ve completely overhauled my His And Hers Colors visualization, which now renders in Retina-Ready SVG using Mike Bostock’s fantabulous D3 library. Read the backstory, then fire up the dataviz, zoom or fullscreen to taste, and mouse/tap each circle to see its color name and gender preference. And, don’t miss the new buttons along the bottom, which sort left-to-right by hue, saturation, popularity, and more.

All in all, that’s 2,000 circles and radial gradients. They sing in Chrome, dance happily on IE9+ and Mac Safari, but trigger some serious Firefox rendering sludge, and spin the iPhone/iPad’s tiny embedded hamster wheel to terminal velocity. When the viz detects the “slow” browsers, it degrades gracefully, drops the fancy transitions, and displays at lower quality to improve the viewing experience.

Please give it a whirl, tweet me your feedback, and stay tuned for more HTML5 hijinks, coming soon.

Beefspace, Revisited Mon, 27 Aug 2012 17:25:39 +0000 Stephen Von Worley

Editor’s Note: This article is a recycled concept from several years ago, updated with a new rendering scheme that made it worth sharing again.

A while back, I created two fanciful maps of a hypothetical, earth-penetrating, inverse-squared burger force, as broadcast by the 36,000-plus domestic restaurant locations of the eight largest U.S. fast food chains. Recently, I upgraded my Visualizationator’s speed by a few orders of magnitude, and as a test, I focused it on said beefspace, yielding several new renditions that you might find more delicious than before…

But first, a few changes. In a nod to the reality of hungry humans, we retooled the power-law metric to mirror the long-distance travels of bank notes, per Brockmann’s seminal “Where’s George” study. The colors now represent the three most influential chains at each point, weighted by cumulative force at a 4:2:1 ratio, where black is McDonald’s, red Burger King, yellow Wendy’s, magenta Jack In The Box, periwinkle Sonic, cream Dairy Queen, green Carl’s Jr., and cyan Hardee’s. Together, you can think of these tweaks as elegantly exposing the subtle contours of market dominance, then splattering them with the individual restaurant locations.
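The recipe above can be roughed out in Python as follows. Treat it as a sketch under loud assumptions: distances are flat x/y rather than geographic, the power-law exponent of 1.59 is a guess based on the scaling reported in Brockmann's study rather than the Visualizationator's actual setting, the RGB values for each chain are approximations of the named colors, and only three chains are listed to keep it short.

```python
import math
from collections import defaultdict

# Approximate RGB values for a few of the chain colors named above (assumed).
CHAIN_RGB = {
    "McDonald's": (0, 0, 0),
    "Burger King": (255, 0, 0),
    "Wendy's": (255, 255, 0),
}
WEIGHTS = (4, 2, 1)  # the 4:2:1 ratio for the three most influential chains


def chain_forces(point, restaurants, exponent=1.59, min_dist=1e-3):
    """Sum each chain's power-law 'burger force' at `point`.

    `restaurants` is an iterable of (chain, x, y); the exponent is an assumption.
    """
    forces = defaultdict(float)
    for chain, x, y in restaurants:
        d = max(math.hypot(x - point[0], y - point[1]), min_dist)
        forces[chain] += d ** -exponent
    return forces


def beefspace_color(point, restaurants, exponent=1.59):
    """Blend the colors of the top three chains by cumulative force, weighted 4:2:1."""
    forces = chain_forces(point, restaurants, exponent)
    top = sorted(forces, key=forces.get, reverse=True)[:3]
    weights = WEIGHTS[:len(top)]
    total = sum(weights)
    return tuple(
        sum(w * CHAIN_RGB[chain][i] for w, chain in zip(weights, top)) / total
        for i in range(3)
    )
```

Ranking by summed force and then blending with fixed weights means the strongest chain always dominates the hue, while the runners-up tint it.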

To wit, behold, the contiguous United States, viewed through the lens of the beefspace:

The Contiguous United States

Now, let’s zoom into East Texas, Mississippi, and Louisiana:

East Texas, Mississippi, Louisiana

That’s Dallas-Fort Worth at upper left and the Mississippi River delta lower right. Note the outlying clouds of also-ran franchise strength, dotted magenta on Houston’s south side and yellow at Jackson, Mississippi.

Next, please peruse the nebulous burger ecosystem of the southern Piedmont, running northeast from Atlanta, Georgia to Charlotte, North Carolina:

Atlanta To Charlotte

Zooming west and inward, we train our gaze on metropolitan Phoenix and the faint rectangular hints of its primary street grid:

Phoenix, Arizona

And finally, let’s roll the clock back to when all of this business began. Our singular universe, zeptoseconds before the Big Bang:

Beefspace, Singularity

Somewhere in there, as Sagan once said, billions and billions of burgers are waiting to be served!

Thanks to AggData for providing the geolocated store information that made this article possible.