Friday, January 30, 2009

Squeezing Python / Django or Ruby on Rails into a Microsoft (ASP) Shop

A twelve step program to a better life

If you like it, please Retweet it! Thanks!

It's hard to ignore all the positive recommendations surrounding Python / Django and Ruby on Rails. But if you're at an all Microsoft Shop, how do you squeeze some of the newer "hip" technologies in the door? Whether you're the boss or just one of the team, it's a challenge! You have to deal with a potentially huge switching cost, training issues, support of multiple environments, selling your co-workers on the technology decisions, and a multitude of other issues.

Here's a strategy that I stumbled upon quite accidentally. It might not work for all situations, but looks promising so far for me.


1. Define a project that you want to get done. Ideally, it's relatively independent of the rest of your systems. This can be a reasonably complicated new project, as ours is.

2. Create a skunkworks team to work on the project. My team was two talented college freshmen (home for Christmas break) working part time, and myself.

3. Assemble the necessary resources. For us, it was a "server" (a 5-year old laptop), a work room (my basement), a couple of laptops and a network, a lot of Coke or other caffeine products (biased recommendation: ávitàe 45) and the phone number to the local pizza joint. Also, we had on hand several Python books. Total setup time: an hour to clean the basement and run a few cables. Unfortunately I had to "tear down" the environment twice for entertaining over the holidays!

4. Set an unrealistic deadline. Because the team would be quickly disbanding to return to The Ohio State University for their second quarter, this was a very firm deadline in our case. I knew I had their attention between December 17 and January 2 - roughly two weeks which included two holidays (totaling about 100 developer hours, since they were working part time), but no time after that.

5. Be ruthless on defining the objective for that time period. We focused entirely on the basic "guts" of the web application - the Python heavy lifting, and totally ignored certain areas that would get filled in later. We continually strategized by asking "what can we do in two weeks?", and eliminated entire sections that were unrealistic to complete. We had a lot to learn and build in two weeks, having only a little experience in Apache/Linux web serving, Python, and MySql, and no real world experience in Django. Our list looked something like this:

  • Get "portable" (laptop) web server running, including Apache, Python, MySql, and mod_python (but not Django).
  • Build a single, simple web page that performs the main function of the application: accept user input, store and retrieve data in the database, return a meaningful result to the user. Add individual pages only as time permits.
  • Focus on "end-to-end" web interaction for one trained (non-malicious) user, and in the case where everything goes right. Focus on the Python guts of this interaction. Focus on the logic, not the presentation.
  • Get a "web server" and simple application up and running in this period.
  • Utilize off-the-shelf open source components if they come close to meeting the needs of the project (even if we'll replace or enhance them later).

We avoided spending time on the following elements:
  • Graphics design, HTML and CSS (there are companies that do this a lot better than your average college freshmen can).
  • Django and Django's strengths: user administration, data models, and database CRUD (record manipulation: Create, Retrieve, Update, Delete). Actually, we handled Create and Retrieve, but only in a minimal sense (knowing that Django would help tremendously here).
  • Google App Engine. (Tough call here; GAE offers a lot, and we utilized it and our portable web server for our prototype. But ultimately, we made the decision that it was better to have one less thing to learn, and better to have the server "in hand" for learning purposes. Cloud computing adds extra abstraction and increases the perceived learning curve.)
  • Error checking, protection from malicious users, security.
  • Reinventing any wheels.
We were, by design, ignoring some critical elements of the web application, but focused entirely on "what functionality can we deliver in two weeks" without getting bogged down in learning curves. We chose the items that played to the strengths of our team (writing "logic" code), and avoided areas where they weren't as strong or didn't have the benefit of many on-the-job years of experience (designing graphics, building robust supportable applications, security, etc.). I know the benefits of good design and designing with security up front, but we intentionally did not do that!

Our mission was, at the end of two weeks, to have a few things that worked, as opposed to twice as much stuff that only half-worked.

6. Spend very little time in the design phase - Just code! This two week effort may end up being the guts of your system, or it may end up being throw-away. Design decisions will become much more clear-cut after the prototype is built. But we never called it a prototype. And this was never considered to be throw-away code - I don't want to insult the developers or minimize their efforts! They do great work, and it was hugely valuable!

7. Prepare a "pitch" to the rest of the team. I knew that upon returning to work on January 5th, I'd have a challenge to transition the skunkworks project into a supportable product. I'd have to pitch it to the rest of the team. My strategy was to sell people on the project one-by-one, since it'd be controversial to bring a bunch of new technologies into the organization. If I "pitched it" to 10 people in a meeting, I figured mob-rule would take over, and they'd conclude that it only makes sense to do this using the technologies that we know - Microsoft technologies.

Since I'd be revealing it to co-workers at various times, trying to bring them on-board one at a time, I thought the best way to do this was a recorded format. I prepared a series of four videos (I called them "shideos", because they were really rough). That way I could explain the system and some of the design ideas in a prepared fashion, and wouldn't need to repeat it over and over as we involved more team members.

The videos / shideos focused entirely on the system, and not any of the potentially controversial technology decisions (Python / Linux).

8. Create a short list of items to complete, to turn the skunkworks project into a real project. Short is key here. The objective is to get other developers on board (the Microsoft guys), and not to overwhelm them. They now have a lot to learn - in addition to the new technologies (Python, Linux, MySQL), they also have to wade through the prototype code to see how the system works. But you don't want it to look so overwhelming that it seems easier to throw it away and start over on a Microsoft platform.

Migrating to new technologies is not all that daunting of a task if you can see it already working!

Our short list looked like:
  • Get it up and running on a real server
  • Fix this short list of bugs
The list did not appear overwhelming, because there was a working system to copy and modify. This helped smooth the transition. Python is not very intimidating if you can see someone else's working code! And setting up Apache, mod_python, and MySql is pretty easy if you have a working version on a laptop sitting next to you.

I almost apologetically presented this list to my coworkers - a team of professional developers. "Here's some code that two college freshmen threw together in a couple of weeks, let's see what we can do with it to make it production-quality."

9. Transfer ownership to the internal team, and continue to advance the project. By now, you should have management approval and even directive to get it done, if the project is worth doing. Acquire any necessary resources (our developers found that materials online were sufficient, but I wouldn't hesitate to buy books to smooth the transition).

We decided to continue to steer away from building things that Django would help us with, as we got the system running in a production-capable environment. Once the development team thoroughly understood the prototype software, and was comfortable making changes in Python, then we'd make the leap to Django. (Obviously it's better to design and build on Django from the start if you can. But our strategy worked pretty well, so that there wasn't one more technology added to the learning curve. Django's benefits are more obvious once you see the "hard way" to create a Python Web Application.)

10. Introduce Django into the environment. We set up Django on our web server, without impacting the running prototype. Then we copied the prototype files into the Django environment, refactoring the code to fit the environment. The cool thing about this approach is that the old prototype never went down during the migration. We could see how the code was "supposed to work" while we migrated it to the new environment. We made the switch to Django one weekend when no one was looking!

11. Implement Django's strengths: The Model-View-Controller (or Model-Template-View) style, multiple environments (dev / test / production), better data CRUD, user administration, data abstraction, database model versioning, and more. Migrating the small prototype to Django made a lot of sense, because we had something to work with. We weren't staring at a blank slate trying to invent a data model for a new system; we could see one that was working!

There aren't a lot of Django books out there. But you can't go wrong with the book by the authors of Django! We also bought this one because of the discussion of Google App Engine in the appendix (ten pages).

12. Roll it into production. At this point, all the developers who are involved are excited about the project, and seem to appreciate the opportunity to learn something new, without being overwhelmed by the learning curve and unrealistic expectations. The process has improved their capability (and market value), and provided a fresh perspective on how to design great web applications.

We'll still have a legacy of Microsoft technologies for many years to come, but it's refreshing to now have an alternative platform!

If you like it, please Retweet it! Thanks!

Thursday, January 29, 2009

Automated Twitter Politeness in Python

Today's post is a mixed bag; I think you'll enjoy it. It combines a discussion of Encouraging Retweeting, with a report on some of my experience, and an example Python/Twitter Automation (automating politeness!) and a little bit of a treasure hunt with prizes!


Part 1: Encouraging Retweeting
If you've seen this before, skip down to part 2!

The idea behind encouraging Retweeting is that you are trying to encourage your followers to re-post your tweet to their followers (perhaps to drive traffic to your blog). You can make this very simple for your readers, simply by providing a link, as I have below. This is a real link! Give it a try!

Click here to Retweet Please!

In case you didn't try the link, here's an explanation: the link will open the visitor's Twitter home page and pre-load the status update field with the Retweet. It doesn't automatically update the visitor's status, but it simplifies the process tremendously. (So you can click it, and see how it works, without committing to the Retweet.)


Here are my steps for encouraging a Retweet.

1. Open a free account at Tweetburner.com. I like Tweetburner better than Tinyurl because you can track clicks (if you have an account!). For example, my last blog post got Retweeted more than 30 times, and I could see that it generated almost 500 hits in 50 hours. Those were pretty decent numbers for a crappy blogpost from a no-name blogger from the middle of nowhere, who is just learning Twitter! Frequent readers will recognize that I like to report my findings of this grand experiment and recognize that this is simply another report of my social experiments stats, and it's not intended to be bragging in any way!






2. Perform a couple of setup steps: Make sure you're signed into Tweetburner. (Can you tell, I've been burned by this?) Also write a ReTweet teaser headline. Something like "RT @amyiris Being automatically polite on Twitter using Python http://twurl.nl/xxxxxx". This is the text that OTHER people will send out as their tweets. Don't start this with an @ - some twitter users have those messages turned off. Start it with RT.

3. Open another browser tab or window, and in your browser address field, paste the following:

http://twitter.com/home?status=

and then at the end of that, paste in your headline that you created. Don't press enter yet, or try to go to the page. Just use the browser address field as a copy-and-paste holding area.

4. Clean up the URL, by changing all the spaces to %20 and all the "at" signs (@) to %40. Hashtags (#) become %23. So your address bar should contain something like:

http://twitter.com/home?status=RT%20%40amyiris%20Being%20automatically%20polite%20on%20Twitter%20using%20Python%20http://twurl.nl/xxxxxx

You'll change that ending "xxxxxx" part in a minute.

5. Create and Publish the Blog Post with placeholders for the Retweet links. My blogpost had this text at the top and bottom:


Click here to Retweet Please!


But when I first published the blogpost, I didn't have the links active, because I didn't know the short URL yet. It's sort of a chicken-or-egg problem. I publish my blogpost to obtain the URL that I am going to shorten. Then I go back and edit the blogpost to put in the links that contain the short URL to the blogpost itself. That's why I do all the setup steps first - to minimize the amount of time that my blog post is published without links.

6. Copy the URL that will take me straight to the blog post. For me, it's something like http://blog.amyiris.com/2009/01/automated-twitter-politeness-in-python.html. Then paste it into Tweetburner. Make sure you are signed into Tweetburner, so you can track it! Then press the "Shorten it!" button. You'll get back a URL like this: http://twurl.nl/4835dg

7. Take that URL and modify your "xxxxxx" from step 4 to include the correct Tweetburner twurl.nl link information which you just got in step 6. So now you have a long URL with %20s and the short URL embedded in it. Copy this long URL to your clipboard, and then test it. You should be taken to your Twitter home page with the RT all filled out, and ready to update your status. This is how it will behave for your visitors.

8. Edit your blog post, and find all the places where you have the words:


Click here to Retweet Please!


Now make them links, by pasting the link code that you copied. Doesn't matter to me that this is a long URL, as long as it contains the short URL in the tweet.
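
Steps 3, 4 and 7 above can also be scripted instead of hand-editing the address bar. Here is a minimal sketch, assuming Python 3 (where the quoting function lives in urllib.parse; on Python 2.x it is urllib.quote). The helper name retweet_link is my own, and the "xxxxxx" placeholder is left exactly as in step 4 until you have your real short URL:

```python
from urllib.parse import quote

def retweet_link(teaser):
    """Build a twitter.com/home?status= link with the teaser pre-filled."""
    # safe=":/" leaves the embedded short URL readable, as in the
    # hand-edited example; spaces, @ and # are percent-encoded.
    return "http://twitter.com/home?status=" + quote(teaser, safe=":/")

teaser = ("RT @amyiris Being automatically polite on Twitter "
          "using Python http://twurl.nl/xxxxxx")
print(retweet_link(teaser))
```

Paste the printed link behind your "Click here to Retweet Please!" text, the same as in step 8.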


Have I mentioned enough times in this blog post that you should
Click here to Retweet Please!



(Credits: This is only a slight variation of Yellow Candy's tip.)



Part 2: Automated Politeness
Stupid Python tricks!

I was inspired by this tweet. (Thanks, @ces614.)


My Thank-You to @ces614 was indeed created by a human - me, the person behind Amy Iris the bot (actually, Amy Iris likes to be called a "Conversational Interface", not a bot). But it's not so far fetched that one could create an automation to thank people for Retweets.

In fact, if you Retweet this for me, Amy Iris (the bot, er, um, Conversational Interface) will politely try to send you an automated response (a direct message) thanking you. But wait, there's more! Just to make it interesting, she's going to give you a clue to a treasure hunt. (Make sure you are "following amyiris" before you Retweet, or you won't get the direct message - if I understand how Twitter works.) [Edit 2/2/2009: I have now turned off the monitor for RT's of this article. I haven't gotten any RT's on this for a few days. You are welcome to RT it, or even try to solve the puzzle through creative searching. Just let me know if you solve it, so you can be considered for the prize. I will live up to my agreement on the prizes, it'll just be a little different challenge (probably easier!) to find the answer online and qualify for a prize.]

As I mentioned, there would be prizes. Prizes go to the first people who figure out a secret message that Amy Iris has, and tweet the 140-character message. She'll give you hints at the message, in your Thank You message. I'll give a free T-shirt to the first ten people who I see that Tweet the secret message.

All standard contest rules apply (like "void where prohibited; decision of judges is final; I am not liable for failure of twitter or your computer, or my program" etc.) Hey, it's just for fun, so let's not get all legal!



OK, so one more time... here's how it works:

1. Amy Iris has a Python Program running [Edit: see note above, it's no longer running] (program listing is below, but you don't need to know ANYTHING about Python to participate. Ability to comprehend my program will not likely improve your capability to figure out the secret message.)

2. The Python program is looking for Retweets that link back here.

3. You Retweet for me, by clicking on this:

Click here to Retweet Please!

4. The Python program sends you a Thank-You via direct message, along with a clue to the secret message. [edit: program is turned off.]

5. You figure out the secret message, and post it in your Twitter timeline.

6. The first ten people who I spot posting the secret message into their Twitter timeline win a T-shirt. Please allow sufficient time for me to get some cool T-shirts printed.



Yes, this is a social experiment, but hey, I'm snowed in. May as well have some fun.

Ready to play along:


Click here to Retweet Please!



Here's the Automated Thank You program that is running [edit: was running]:


# This program is an example of an auto Thank You program.
# It scans for ReTweets, and when it finds one, it sends a thank you
# via Direct Message to the user who ReTweeted.
#
# The Thank-You includes a clue to the secretMessage.
#
# Built for Python 2.x (2.5 is running here),
# will not work on Python 3.0 without modification.
#

import twitter as twitterapi
import random, re, time, urllib2

thanked = {}

fil=open ("thanksguys.txt","r")
for t in fil.readlines():
    thanked[t[:-1]]=True
fil.close()
print thanked

mypassword = open("pwd.txt","r").read() #hide it, of course
secretMessage = open("sm.txt","r").read() #you're trying to figure this out
baseurl="http://search.twitter.com/search?q="
query = "http://twurl.nl/4835dg"
restring1 = r'<div class="msg">.*?<a href="http://twitter.com/'
restring1+= r'(.*?)".*? class="msgtxt.*?">(.*?)</span>'
restring2 = r'<a href="/search.max_id=(.*?)&page=.*?&q=(.*?)">Older</a>'
thankyou = "Thanks for the RT! Here's your clue: The secret message %ss with %s"


assert len(secretMessage)==140 #clue: it's 140 characters!
assert secretMessage.count("*")==0 #no asterisks in it!


api = twitterapi.Api(username='amyiris', password=mypassword)


while True:
    addon=""
    try:
        for i in range(2,10): #search through up to 8 search result pages
            try:
                r=urllib2.urlopen(baseurl+query+addon).read()
            except:
                break
            retweeters=re.findall(restring1,r,re.DOTALL)


            for g in retweeters:
                if g[0] not in thanked: #sorry, one per person!
                    cointoss = random.randrange(2)
                    codedmsg = secretMessage[cointoss*70:(cointoss+1)*70]
                    while codedmsg.count("*")<56: #give em 10% of the message
                        rnd=random.randrange(70)
                        codedmsg = codedmsg[:rnd]+"*"+codedmsg[rnd+1:]
                    try:
                        api.PostDirectMessage(g[0],thankyou%(\
                            "startend"[cointoss*5:5+cointoss*3],codedmsg))
                        thanked[g[0]]=True
                        fil=open ("thanksguys.txt","a")
                        fil.write(g[0]+"\n")
                        fil.close()

                        print "sent"
                    except:
                        pass

            f2=re.findall(restring2,r,re.DOTALL)

            try:
                addon="&max_id="+f2[0][0]+"&page="+str(i).strip()
            except:
                break
        time.sleep(60)
    except:
        pass
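
If you just want to see how the clue gets built, here is that one step pulled out on its own. This is only a sketch, written for Python 3 (the listing above targets 2.5). The function name make_clue and the demo string are made up for illustration; the demo is NOT the real secret message. It also uses random.sample to pick the hidden positions in one go, which is a variation on the retry loop above, not the same code:

```python
import random

def make_clue(secret, half, keep=14, rng=random):
    """Return one 70-character half of `secret` with all but `keep`
    characters replaced by asterisks."""
    part = secret[half * 70:(half + 1) * 70]
    # Choose distinct positions to hide, so exactly len(part) - keep
    # characters end up masked.
    hidden = rng.sample(range(len(part)), len(part) - keep)
    chars = list(part)
    for pos in hidden:
        chars[pos] = "*"
    return "".join(chars)

# Hypothetical 140-character stand-in for the secret message.
demo = "abcdefghij" * 14
half = random.randrange(2)
word = ("start", "end")[half]   # same result as the "startend" slicing trick
print("The secret message %ss with %s" % (word, make_clue(demo, half)))
```

With keep=14, each clue shows 14 of the 140 characters, which is the 10% mentioned in the listing's comment.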



Click here to Retweet Please!

Tuesday, January 27, 2009

C'mon Corporations! Meet the Semantic Web Halfway!

Click here to Retweet, please!

Today I got some validation that I am on the right track with some of my ideas. I made some comments nine days ago, imploring corporate web developers to make their information available via API's, and citing Best Buy specifically. Today I read that Best Buy did just that!

This is exactly what every web developer needs to be thinking about! It's the next phase of web development, and can keep all of us Web Developers gainfully employed for the coming years.

Think about the information that you have available that might be of interest to the general public. In Best Buy's case, it was Product Information (Product Number, Description, Price, Availability, Product Reviews), and Store Information (Location, Hours).

Note that this can be very fixed information (like price), or free-form information (like product review text).

Starbucks and Panera Bread (and every chain restaurant): I want your store locator accessible via an API. If I query by zip code, street name, city or state, I want to know the nearest store. I also want your store hours available via API.

Restaurants, if you accept reservations, I want an API to allow me to book online.

Delta Airlines: I want to know your flight schedule, along with availability and price. If flights are en route, let me query the current status (ETA). Put it in a machine-friendly API, not a human-friendly web interface. Make it easy for developers to mash it up!

Procter & Gamble: I want to be able to access frequently asked questions about your products. Where can I buy Febreze Air Fresheners? What are the customer reviews for Cover Girl Cosmetics? What are the replacement part numbers for my Braun shaver? How many carbs are in Crest White Strips? How many days in a row can I take Pepto Bismol? What are the usage instructions for Head and Shoulders?

For all publicly held corporations, how about some stock information and history? Where can I send my resume, and what jobs do you have available? Give me this information in an API!

GM: Put your new car product specs online in a JSON or XML format. What models do you offer for this year? Why should I need to comb through Kelley Blue Book to get this data in an organized fashion? How about a dealer locator API?

HP, provide an API for product lookups, and parts. I want to be able to query for a battery for my Compaq nc8430 computer, and the cartridge number for my DeskJet printer. But I don't want to use your human-oriented webpage. I want to use an API!

Even providing LINKS in an API would be valuable. Do you have PDFs online with instructions for your products? An API can provide a lookup for the PDFs.

Wal-Mart, Target, Sears - follow Best Buy's lead. Product information and Store Information to start with!

GE, I want product instructions online, queryable through an API. I want to be able to look up my oven's part number and find parts and warranty information. How about a distributor locator? Executive Information - who's who, with job history in an organized fashion. How about Press releases reachable and searchable via an API?

AT&T, I want stock information that's queryable. With all your buyouts and divestitures, I want to be able to put in my stock purchase date, and know what the basis is. You have this information, help me find it with an API!

Radio Shack, can you provide me with a product equivalent guide? I have an A23 battery - what are the specs, and what is the Radio Shack equivalent? Provide it in an API so that a program can get to it!

Verizon, how about an API to allow me to send text messages to subscribers programmatically? Sure, you can require authentication or permission, or have other spam controls. How about a lookup of subscribers? Why not allow me to query on type of service - I have a buddy's phone number, but I don't want to send him a text message if he doesn't subscribe to that service (or if it'll cost him money to see it).

Or, if you are a standard blogger, SURELY you have something of interest to other people, that you can build an API for.

Making information machine-friendly via APIs will be the fulltime job of many web developers in the coming decade. Start now!

Click here to Retweet, please!

Tuesday, January 20, 2009

Twitter Automations with Python Scripts, part 2

Finding like-minded people in Twitter can be a challenge. Today's code demonstrates scraping a Twitter Directory site, called Twellow.com, looking for users who are listed under their directory listing for Python.

Twellow isn't a complete directory (right now, they index only about 800K users, which might be around 10% complete), but it has a listing of 280 Twitter users in their Python section. It'd be nice to know who those users are.

One known issue with this code is that it grabs too many users. Since Twellow also shows recent tweets, it's possible that a Python user mentions another user in an @Reply, and this program picks up that user as well. (I consider that a feature.) Twellow reported, for example, that there are 280 Twitter users in its Python list, but this program retrieved 359 for me (which apparently picked up @replies as well).



import re, urllib2

turl="http://www.twellow.com/category_users/cat_id/181/page_num/" #cat_id=181 for Python people
seen=[] #users already written to the file

for page in range (1,15):
    r=urllib2.urlopen(turl+str(page).strip()).read()

    #grab every twitter.com profile link on the page (this includes @reply mentions)
    friends=re.findall(r'<a href="http://www.twitter.com/(.*?)"',r,re.DOTALL)

    fl=open('twitterspythontwellow.txt','a')
    for fr in friends:
        if fr not in seen:
            fl.write(fr+'\n')
            seen.append(fr)
    fl.close()
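
Here is the "have I seen this user already?" step by itself, so you can try it without hitting Twellow. It's a rough sketch for Python 3. The function name new_users and the sample HTML are invented for the example, and it keeps the seen names in a set rather than a list (a small substitution of mine, since membership tests on a set stay fast as the list of users grows):

```python
import re

PROFILE_LINK = re.compile(r'<a href="http://www\.twitter\.com/(.*?)"')

def new_users(html, seen):
    """Return profile names found in `html` that are not yet in `seen`,
    adding them to `seen` along the way."""
    fresh = []
    for name in PROFILE_LINK.findall(html):
        if name not in seen:
            seen.add(name)
            fresh.append(name)
    return fresh

# Made-up page fragment: alice appears twice (once as an @reply mention).
sample = ('<a href="http://www.twitter.com/alice">alice</a>'
          '<a href="http://www.twitter.com/bob">bob</a>'
          '<a href="http://www.twitter.com/alice">@alice</a>')
seen = set()
print(new_users(sample, seen))
```

Calling new_users again with the same seen set returns an empty list, which is what keeps duplicates out of the file from page to page.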




Since my last code snippet wasn't very copy-paste friendly, I used the following tool today, to format my code for Blogger. Hopefully it will make the code more usable.

http://francois.schnell.free.fr/tools/BloggerPaste/BloggerPaste.html

Sorry about the last one!


Please leave comments with other tools. Know of a better directory than Twellow?


P.S. My social experiment yesterday was a flop. Still, the results may be worth mentioning. I suspected that I could put a half-ass blog post together, and throw it out onto a Wiki-style site (in this case, Wetpaint), and see the community rally to "finish it". Well, I was wrong! There were a couple of edits to it, but it wasn't nearly as effective as I thought it would be.

The model I was inspired by was the Twitter Fan site and the Twitter API wiki. The Twitter API is completely documented on the latter, and the former has a lot of other fun facts about Twitter. With my limited data, and unscientific trial, it appears that only about 0.2% of users will take the time to make an edit (on the first day, anyway). Makes me wonder how Wikipedia got to be so successful!


Please do me a favor. If you like this blog post, please Re-Tweet it. Thanks!

Sunday, January 18, 2009

Building the Semantic Web, Twitter style

Every Web Developer should learn a lesson from Twitter, and design a parallel set of web pages that are more machine readable (like the Twitter API). As we move toward the Semantic Web, this will become a business necessity.

The Semantic Web is widely considered to be the next major revolution in Internet technology. The concept is that the web currently has a wealth of information that is designed for humans to read, but there are efforts under way to make that same information meaningful to machines. Two specifications, RDF and Microformats, are leading the way.

In both cases, HTML pages would be modified slightly to include "meta data" so that machines can make sense of the information that is being presented. For example, if you have contact information on a web page, why not wrap the phone number with an HTML tag that identifies it as a phone number to a web crawler reading your page? Not just A phone number, but YOUR phone number.
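
To make that concrete, here is a small sketch of what the crawler's side could look like, written for Python 3. It assumes the page marks the number up hCard-style with class="tel". The sample page, the name and the phone number are all made up, and the TelFinder class is my own illustration, not part of any specification:

```python
from html.parser import HTMLParser

class TelFinder(HTMLParser):
    """Collect the text of every element whose class list includes 'tel'."""

    def __init__(self):
        super().__init__()
        self._open_tag = None   # tag name of the 'tel' element being read
        self.numbers = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._open_tag is None and "tel" in classes:
            self._open_tag = tag
            self.numbers.append("")

    def handle_data(self, data):
        if self._open_tag is not None:
            self.numbers[-1] += data

    def handle_endtag(self, tag):
        # Simplification: assumes the 'tel' element has no nested
        # element with the same tag name.
        if tag == self._open_tag:
            self._open_tag = None

page = ('<div class="vcard"><span class="fn">Example Person</span> '
        '<span class="tel">555-0100</span></div>')
finder = TelFinder()
finder.feed(page)
print(finder.numbers)
```

The crawler never has to guess which string of digits is the phone number; the markup tells it.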

One more technique that could provide value, though there's little discussion about it, is the API approach.

Twitter has done a fantastic job building a very usable API that simplifies programming tasks to access its data. My previous post was an example of the hard way of scraping HTML to find the data that one might be looking for. It's not too difficult, but, as the first commenter pointed out, the API makes this much easier.

The issues with HTML scraping are that there's more setup time (the developer has to examine the HTML and decide how to parse it in advance), and there is less reliability (one change to the underlying HTML can break the scraping program).

A much better way to accomplish this task is through a well defined API. In a sense, this is what RDF and Microformats define - a predictable way to extract information from a web page.

Twitter could have used the same strategy as RDF and Microformats - they could have produced an API specification that said "If you want to programmatically interface with us, here is what you can rely on, as far as our HTML". And that would have achieved the objective.

Instead, they created a dead-simple API. You want Twitter User data or status updates, hit a certain web page with a certain parameter, and you'll get back your data in XML or JSON format (or ATOM or RSS). For example:

http://twitter.com/statuses/public_timeline.format
http://twitter.com/statuses/public_timeline.xml

This example link gives the public timeline in XML format.

The API documentation provides all you need to get started accessing data via the API.

Astute web developers will examine this design pattern and apply it to their business. What data do you have on your website that your customers are trying to get access to? Your web pages allow humans to be able to read it, but the next step is to enable machines to get at it easier.

Sure, machines can scrape your HTML, but why not meet them halfway? RDF and Microformats are a great step. But another simple step would be to provide an API.

If you're a developer, and you want to see the magnitude of this issue, simply go to any retailer's web site (Best Buy, Staples, Wal-Mart, etc.), and try to build an HTML scraper that grabs the product number, name, description and price of, say, every product that contains the word "television".

Let this be a call to every web developer in every major corporation! Build an API! You want to increase your online sales? Enable machines to be able to find your products or any other information that you have available. Enable machines to be able to search for your products, and buy your products. Look to the Twitter API as an example of simplicity!
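
How little does it take? Here is a bare-bones sketch in Python 3 of the kind of endpoint I'm asking for: one URL, one parameter, JSON back. Everything in it is hypothetical: the /products.json path, the q parameter, the handle function and the two catalog entries are invented for illustration and don't describe any real retailer's API. A real version would sit behind your web server and query your product database:

```python
import json
from urllib.parse import urlparse, parse_qs

# Made-up sample catalog standing in for a real product database.
CATALOG = [
    {"sku": "TV-001", "name": "Example 32-inch television", "price": 199.99},
    {"sku": "RD-002", "name": "Example clock radio", "price": 24.99},
]

def handle(url):
    """Answer a request such as /products.json?q=television with JSON."""
    parts = urlparse(url)
    if parts.path != "/products.json":
        return json.dumps({"error": "unknown endpoint"})
    # parse_qs returns a list per parameter; take the first q value.
    needle = parse_qs(parts.query).get("q", [""])[0].lower()
    hits = [p for p in CATALOG if needle in p["name"].lower()]
    return json.dumps({"query": needle, "products": hits})

print(handle("/products.json?q=television"))
```

Compare that with writing a scraper for the "television" search on a typical retail site, and the case for the API makes itself.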

Do NOT hide your shopping experience within a maze of session-dependent form posts. Works great for humans, but not for machines!

Tuesday, January 13, 2009

Twitter Automations with Python Scripts

Several readers have asked for code samples of how to utilize Python to automate some Twitter tasks. Below is a simple code sample to go to the Search page, and look for users who have recently mentioned "Python Programming" in their tweets.

This creates a file that contains a stripped down version of the tweet, removing any HTML. In the first 20 characters, the Twitter user's name is listed, so that you can quickly read through the file looking for people who may share a common interest.

A similar program can be created to parse this file and add users automatically.

Amy Iris uses simple scripts like this to perform some of her tasks of building a community of followers. I hope you find it useful (even though it's far from perfect). Feel free to use this code for good only, and consider putting a pause in the code so that you do not hammer the twitter servers!



# This program finds Twitter users based on topic and dumps them to a file.
# Written for Python 2 (urllib2).

import re, time, urllib2

baseurl = "http://search.twitter.com/search?q="
query = "python+programming&lang=en"
addon = ""

for i in range(2, 100):
    r = urllib2.urlopen(baseurl + query + addon).read()

    # capture (user name, tweet HTML) for each result on the page
    f = re.findall(r'<div class="msg">.*?<a href="http://twitter.com/(.*?)".*? class="msgtxt.*?">(.*?)</span>', r, re.DOTALL)

    fil = open("topictweetspython.txt", "a")
    for g in f:
        # strip HTML tags from the tweet text
        g0 = re.sub(r'<[^>]*>', '', g[1])

        # user name padded or cut to 20 characters, followed by the tweet
        p = (g[0] + " " * 20)[0:20] + g0 + "\n"
        fil.write(p)
    fil.close()

    # the "Older" link carries the max_id needed to request the next page
    f2 = re.findall(r'<a href="/search.max_id=(.*?)&page=.*?&q=(.*?)">Older</a>', r, re.DOTALL)
    if not f2:
        break  # no "Older" link, so there are no more pages

    addon = "&max_id=" + f2[0][0] + "&page=" + str(i)
    time.sleep(2)  # pause so that we do not hammer the Twitter servers
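
As a starting point for the follow-up program mentioned earlier, here is a rough sketch of reading the user names back out of the output file. It assumes one tweet per line, with the user name in the first 20 characters as the script above writes it; the function name `read_usernames` is just a name chosen for this example.

```python
def read_usernames(lines):
    """Return the distinct user names from the fixed-width output lines.

    Each line is assumed to hold the user name padded (or cut) to
    20 characters, followed by the tweet text.
    """
    seen = set()
    users = []
    for line in lines:
        name = line[:20].strip()
        if name and name not in seen:
            seen.add(name)
            users.append(name)
    return users

# Sample lines in the same layout the script writes to topictweetspython.txt
sample = [
    "alice".ljust(20) + "Learning Python programming today\n",
    "bob_the_builder".ljust(20) + "Python programming is fun\n",
    "alice".ljust(20) + "More Python programming\n",
]
print(read_usernames(sample))
```

With a real file, pass `open("topictweetspython.txt")` in place of `sample`. Actually following each user is left out here, since that step depends on whichever Twitter client or API you use. Note that user names longer than 20 characters will have been cut off by the script, and a tweet containing a line break would produce a stray entry.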

Asymmetrical Communications and Computer Assisted Social Networking (CASN)

Nearly any male college freshman will tell you that his mother's need to hear from him far exceeds his need to communicate back. These asymmetrical communication needs pose some interesting challenges and opportunities.

I believe that bloggers and individuals looking to expand their outreach through Social Networks face some of the same challenges. These individuals want to build a one-to-one "personal" relationship with each and every reader, while also building an audience so massive that it is beyond the realm of the personal. Politicians and celebrities face similar challenges.

CASN is a term describing tools for Computer Assisted Social Networking (or Computer Aided Social Networking). These are tools that help automate your social networking communications. There are a variety of tools available that allow you to communicate more efficiently, basically addressing three areas: reaching new followers, building relationships, and measuring the results.

My previous blog posts document my foray into quickly building a network of followers. Much of my experimentation to date has been in the "reaching new followers" arena, and was primarily conducted using Python scripts (although it could easily be done manually as well). Unannounced "Follows" walk the fine line of Spam, and I've tried to provide content that demonstrates that I am not interested in spamming people. It was not difficult to get 300 followers in five days simply by targeting a group of people who might have mutual interests, and then following them. Hopefully my activities and documented results have been helpful and not annoying.

Another challenge is to engage your followers. How do you engage a population of 300 people, so that you are providing value to each and every one on an individual, personal level? I find it interesting to ponder the implications of tools that would automate this process, and this will be the subject of some of my upcoming experiments.

First, the ethics must be addressed. Is it unethical for a college freshman to use a program to send a "canned" (pre-scripted) email to his mother every day, saying everything is OK? Imagine if there was a website that allowed you to select from a variety of pre-written emails to send to your mother, and each day, with the click of a button, the freshman efficiently sent his mother a short pre-written note selected from the library on the website. It's the thought that counts, and Mom gets a 250-word email to satisfy her needs, while Son isn't tied up writing to Mom, so he can go communicate with the girls on campus to satisfy his needs. Seems like a win-win!

This would be highly effective if the emails seemed personal, and not "canned". Is Son "tricking" mom? Or has he just found a way to satisfy her communication needs in an efficient manner?

Barack Obama, Britney Spears, and many others are using social networking to strengthen the relationship with their followers. Do you feel "tricked" knowing that Barack actually had a media person sending out his tweets? Or that Britney is currently hiring a media manager to do her social networking for her? I think we all pretty much expect it, and would be surprised if Barack got on Twitter himself to send out Tweets!

So how can the average Twitter user or blogger do the same, if he or she wants to? Is there a way to automate your communications with your followers, in a way that builds a relationship, and doesn't destroy it? I plan to aim my upcoming experiments in this direction.

Some automations would be trivial - for instance, I have 300 followers, and I know the general location of many of them. If my automation program (bot?) notices that someone posts a tweet between the hours of 1 AM and 6 AM, sending a quick direct message that says "What are you doing still awake at this hour?" is a simple way to build that relationship. (Some of you have already gotten those messages from me, but honest, that was one of Amy Iris' human creators, not Amy Iris the bot. Still, would you know? Would you care? It's the thought that counts, and the human behind it is reaching out to you.)
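
The late-night check could be sketched roughly as follows. It assumes the tweet timestamp is available in UTC and that you have a guess at the follower's UTC offset; both function names are invented for this example, and sending the direct message itself is left out.

```python
from datetime import datetime, timedelta

def is_late_night(tweet_time_utc, utc_offset_hours):
    """True if the tweet was posted from 1 AM up to (not including) 6 AM local time."""
    local_time = tweet_time_utc + timedelta(hours=utc_offset_hours)
    return 1 <= local_time.hour < 6

def late_night_message(tweet_time_utc, utc_offset_hours):
    """Return the direct-message text to send, or None if no message is due."""
    if is_late_night(tweet_time_utc, utc_offset_hours):
        return "What are you doing still awake at this hour?"
    return None

# A tweet posted at 07:30 UTC by a follower assumed to be on US Eastern time (UTC-5)
print(late_night_message(datetime(2009, 1, 13, 7, 30), -5))
```

The weak point is the offset: a follower's location is only a guess, so a wrong guess sends the message at the wrong time of day.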

Other automations could be quite sophisticated. Say a follower of mine posts a message that contains the three words "new blog post" and has an embedded link. It's trivial to create an automation that grabs the link, retrieves the blog post, scans it for unique words, performs a search on those words (maybe on Google, Google Video, or YouTube), and sends back a direct message saying "nice blog post. reminds me of this video." (along with a link). Of course, the more that the automation appears to be from a human, the more effective it would be. Maybe wait 5 minutes (the amount of time it might take a human to perform the same task). Maybe include a typo. Still, I question the ethics. Is this spammy? Is it deceitful? Or is it just being efficient?
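
The first two steps of that pipeline, spotting the trigger phrase and choosing search words, might look something like this sketch. The stop-word list and the "longest words first" ranking are crude assumptions standing in for a real measure of how unusual a word is; fetching the post, running the search, and sending the message are not shown.

```python
import re

TRIGGER = "new blog post"

# A short, hand-picked list of common words; an assumption for this sketch.
COMMON_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
                "for", "on", "with", "this", "that", "my", "i", "about"}

def find_blog_link(tweet):
    """Return the first link in the tweet if it contains the trigger phrase, else None."""
    if TRIGGER not in tweet.lower():
        return None
    match = re.search(r"https?://\S+", tweet)
    return match.group(0) if match else None

def distinctive_words(text, count=3):
    """Return up to `count` uncommon words, longest first (ties broken alphabetically)."""
    words = set(re.findall(r"[a-z]+", text.lower())) - COMMON_WORDS
    return sorted(words, key=lambda w: (-len(w), w))[:count]

tweet = "New blog post: Django tips http://example.com/django-tips"
print(find_blog_link(tweet))
print(distinctive_words("My notes on the Django templating engine and the Django admin"))
```

The words returned by `distinctive_words` would become the search query. Word length is only a stand-in for rarity; comparing against word frequencies from a large body of text would pick better search terms.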

My upcoming experiments will touch on this area - automating the personal one-to-one communications with your followers.

I will also put the link to the blog back onto my profile page (I removed it Sunday afternoon). I'll try to write blog posts in such a way that a quick glance doesn't reveal the nature of the Amy Iris project, and instead starts with generic information as this one did, continuing to give the illusion of Amy Iris as an individual blogger and not a project. But buried within the post will be more information about the project.

Some of my astute Twitter followers have guessed that Amy Iris is some sort of bot (or even artificial intelligence). It is true that Amy Iris is a project, and not a human, although clearly these blog posts and Tweets have been generated by a human on the Amy Iris team.

For the record, our project team is not claiming that Amy Iris is Artificial Intelligence. We believe that Amy Iris will advance A.I., but the official stance of the project team is that the relationship between Amy Iris and A.I. is that it's just her initials. However, you be the judge as more is revealed. The time of her unveiling is approaching fast!

Trust us... this changes EVERYTHING.....