After writing my last post about TweepMe, I was intrigued to know just what someone would be letting themselves in for if they did sign up to the service: which accounts they could expect to see updates from, and so on.

Well, curiosity got the better of me, so I threw some code together to scrape their pages and cross-reference each listed user against their Twitter page, then wandered away while it did its job. When I returned it had spat out a file with just shy of 2,000 people (probably more have signed up since then; that was just how many were showing at the time of the scrape). For each person listed on TweepMe I retrieved:

  • Twitter account name
  • Name
  • Location
  • Website
  • Following Count
  • Followers Count
  • Update Count
  • Bio
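For anyone wanting to work with a file like this, here is a rough Python sketch of how one record per person could be modelled and saved. It is only an illustration, not the code that produced the file: the `Tweep` class, its field names and the CSV layout are all assumptions based on the list above.

```python
import csv
from dataclasses import dataclass, asdict, fields


@dataclass
class Tweep:
    # Hypothetical field names, one per item in the list above.
    account: str
    name: str
    location: str
    website: str
    following: int
    followers: int
    updates: int
    bio: str


def write_tweeps(tweeps, out):
    """Write records to an open text file as CSV with a header row."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(Tweep)])
    writer.writeheader()
    for tweep in tweeps:
        writer.writerow(asdict(tweep))


def read_tweeps(src):
    """Read records back, converting the three counts from text to int."""
    result = []
    for row in csv.DictReader(src):
        for key in ("following", "followers", "updates"):
            row[key] = int(row[key])
        result.append(Tweep(**row))
    return result
```

Keeping the counts as integers rather than text makes it easy to sort and filter on them later.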

I’ve had a quick spy through this file, and as I suspected it is very heavy on the following:

Product / services accounts – radio stations etc – aka you’re going to be spammed.

Accounts with a massive following – they just want more followers, that simple.

SEO and “Social Media guru” types, etc.

~250 accounts with < 10 followers looking to beef up their numbers.

And worryingly, the top 5 accounts alone are responsible for 128,000 tweets since their inception; that’s some serious timeline flooding!

Looking for one of my own interests, a quick scan for ‘photo’ turned up only 87 tweeps mentioning it in their data; out of nearly 2,000, that is not a high ratio at all for my liking.
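Checks like the ones above (accounts with very few followers, the heaviest tweeters, a keyword scan) are easy to repeat. A minimal Python sketch follows, assuming each account has been loaded as a dict with hypothetical keys such as `followers`, `updates` and `bio`; the function names are made up for illustration and the real file may be laid out differently.

```python
def low_follower_accounts(rows, threshold=10):
    """Accounts with fewer than `threshold` followers."""
    return [r for r in rows if r["followers"] < threshold]


def top_by_updates(rows, n=5):
    """The n accounts with the highest update (tweet) count."""
    return sorted(rows, key=lambda r: r["updates"], reverse=True)[:n]


def mentions(rows, keyword):
    """Accounts where the keyword appears in any field, ignoring case."""
    needle = keyword.lower()
    return [
        r for r in rows
        if any(needle in str(value).lower() for value in r.values())
    ]
```

With the rows loaded, `len(low_follower_accounts(rows))` gives the low-follower count, `sum(r["updates"] for r in top_by_updates(rows))` gives the combined tweets of the top 5, and `len(mentions(rows, "photo"))` gives the keyword count.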

If you want a spy through the data yourself, you can grab a copy here:

I’m keen to hear about anything interesting that anyone turns up, so please do comment if you find something in the data.

~Shepy

UPDATE: After a tweet from @AlohaArleen, who presumably has something to do with the site, I’d like to share a couple of tweets about this post, just to make sure there is no misunderstanding about this data:

AlohaArleen : @Shepy You can’t use data extraction on the TweepMe site. The pages do not show all the users! Not even an acculmanation! FAIL! #tweepme

Shepy: @AlohaArleen How is it a fail, i scraped what was available, i never said it’s exhaustive, infact i said it wasnt. Defensive much? #tweepme

Shepy: @AlohaArleen Those accounts are reg’d, and so the data does give an accurate display of some of the accounts expected to follow. #tweepme