Big Data Problems

As I mentioned in a post a few months back, there are a few problems with mining Twitter for locational data, partly because the sample is less than representative. Related to this, Wired has an article today on big data and the ‘death’ of theory. Mark Graham, who is part of the floatingsheep collective, has this to say in it:

“I do get why people think that ‘big data’ will mean the end of theory, because you can now answer almost any conceivable question with large data sets and transactional data shadows, but irrespective of how big or complete our datasets are, they will always be selective and partial. We’re talking about a classic ‘if you have a hammer everything starts to look like a nail’ issue here.” 

Or, in other words, from the same Wired article and in reference to the original floatingsheep map I commented on:

not everyone tweets, and not everyone who tweets geotags their tweets. Even with the…contextual geotagging of tweets, that still leaves a sample of tweeters that isn’t absolutely everyone. It’s still a sample of “people with the capability and urge to tweet”.  

And so the issue of a small, unrepresentative sample remains. Not quite the takeover of big data just yet.


Streetview Exhibition


An exhibition by Skyliner based on Google’s Street View, for city-lovers, map-makers and fans of Greater Manchester’s architecture. There is also a secondary exhibition documenting the demolition of the former BBC building.

The exhibition is at 2022 on Dale Street in Manchester’s Northern Quarter. The opening is tonight from 6pm, with music from 8.30pm. Skyliner is a blog on the architectural history of Greater Manchester, and it’s well worth a look. The posts on the Cromford Court apartments above the Arndale Centre, the stunning Albert Hall on Peter Street and the former Lewis’s department store on Market Street are particularly worth reading, and there are some fantastic photos of all of them too.