#!/usr/bin/env python
import collections
import random
import sys

filename = sys.argv[1]
# number of previous letters to remember (defaults to 3)
know = len(sys.argv) > 2 and int(sys.argv[2]) or 3
num = 1000
alphabet = "abcdefghijklmnopqrstuvwxyz \n"

# expected[previous letters][next letter] -> number of times seen
expected = collections.defaultdict(lambda: collections.defaultdict(int))
text = open(filename)
previous = ""
for line in text:
    for letter in line:
        # if letter not in alphabet: continue
        expected[previous][letter] += 1
        previous = (previous + letter)[-know:]
text.close()

def weighted_select(weights):
    # pick a letter with probability proportional to its count
    total = sum(weights.values())
    index = random.randrange(total)
    for letter, weight in weights.iteritems():
        index -= weight
        # strictly less than zero, otherwise earlier letters are slightly favored
        if index < 0: return letter
    print "Unexpected", index, weights
    return letter

start = random.choice(expected.keys())
thoughts = [letter for letter in start]
for unused in xrange(num):
    last = "".join(thoughts[-know:])
    thoughts.append(weighted_select(expected[last]))
print "".join(thoughts)
Example:
./textual.py sample_text 4
writing deeper
it is a fail of you
my timid anxious queries bereft a perhaps it is only have unbeknow amazing has and i are
aimless arms
i done side in three
this curious that my lither mary
has anythin is as for me other mocking thought gain three
that still bring i remember
beform
that further being i remains me i are myself in that of timid anxious queries back trail may you is only as reality is made of hearts leaping in her
and want
but as that we happy returning is not word or recallinger serves ive unbeknow how unliking that furthere beauty a simply by its sublimitation has and bring i remains meager servation unrequited you are sunderstanding me to do known
if i find real as real as unto you most days for their bound
accept a perhaps it is under cling me shadows yet unhappier by
far word or soul to real as unto accurate too much discarded a bring lithe one you
what imprehend
my coverabundant lies back trail their ground
and their like apartake u
If anyone with the free time is clever enough to strip crappy poetry from online to use as a generator for this, I'd love to see the results.
Tuesday, December 1, 2009
Poetry generator
Continuing off the previous post, here's a nonsense (or poetry) generator. Takes a filename as an argument, optional second argument for the number of letters back to remember. Obviously this would need substantial work to be useful, but I was quite personally entertained by the return on minimal investment. Uncommenting the alphabet line leads to a different style.
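For anyone on Python 3, the counting and the weighted pick can be sketched a bit more compactly. This is only a sketch under assumptions, not the script itself: build_table and generate are names made up here, the text is an inline string rather than a file, generation always starts from the empty context instead of a random one, and random.choices (Python 3.6+) stands in for the hand-rolled weighted_select.

```python
import collections
import random

def build_table(text, know=3):
    """Count which letter follows each run of up to `know` letters."""
    table = collections.defaultdict(lambda: collections.defaultdict(int))
    previous = ""
    for letter in text:
        table[previous][letter] += 1
        previous = (previous + letter)[-know:]
    return table

def generate(table, num, know=3, rng=random):
    """Emit up to `num` letters, each picked in proportion to its count."""
    out = []
    for _ in range(num):
        last = "".join(out[-know:])
        weights = table.get(last)  # .get() does not create a missing entry
        if not weights:            # context never seen with a follower: stop early
            break
        letters = list(weights)
        counts = [weights[letter] for letter in letters]
        out.append(rng.choices(letters, weights=counts)[0])
    return "".join(out)

sample = "the quick brown fox jumps over the lazy dog\n" * 3
table = build_table(sample)
print(generate(table, 60))
```

Passing the counts straight to random.choices avoids the off-by-one that is easy to make when walking the weights by hand.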
The joy of collections
I recently discovered the collections library in python (http://docs.python.org/library/collections.html), which is a treasure trove of various handy data structures. As a fan of math, I had already found myself creating many of these classes and functionalities, and it is good to know others think the same way. These are implemented in C, which obviates any python scratchings I had been using.
In particular, here are a few cool tricks using the defaultdict type, which functions as a dictionary, but with a default value (saving the use of many d[k] = d.get(k, ) ... lines and helper functions).
defaultdict takes a function to call as the initializer for the default value of a new key. Since int() evaluates to 0 and list() evaluates to [], we save ourselves some lambda or helper functions here and use those.
import collections
d_counts = collections.defaultdict(int)
d_lists = collections.defaultdict(list)
sentence = "the quick brown fox jumps over the lazy dog"
for i, letter in enumerate(sentence):
    d_counts[letter] += 1
    d_lists[letter].append(i)
for letter, count in sorted(d_counts.iteritems()):
    print "%s - %s" % (letter, count)
for letter, indices in sorted(d_lists.iteritems()):
    print "%s - %s" % (letter, indices)
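To see what defaultdict is saving, here is a small Python 3 sketch (the code in this post is Python 2) that builds the same letter tallies both ways. The plain_counts and plain_lists names are made up for the comparison.

```python
import collections

sentence = "the quick brown fox jumps over the lazy dog"

# Plain dict: every update has to spell out the default.
plain_counts = {}
plain_lists = {}
for i, letter in enumerate(sentence):
    plain_counts[letter] = plain_counts.get(letter, 0) + 1
    plain_lists.setdefault(letter, []).append(i)

# defaultdict: the factory (int or list) supplies the default on first use.
d_counts = collections.defaultdict(int)
d_lists = collections.defaultdict(list)
for i, letter in enumerate(sentence):
    d_counts[letter] += 1
    d_lists[letter].append(i)

# A defaultdict compares equal to a plain dict with the same contents.
print(d_counts == plain_counts, d_lists == plain_lists)
print(d_counts["o"], d_lists["o"])
```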
You can also nest the initializers; this code might be appropriate for use in a word-generating program.
d_dict_int = collections.defaultdict(lambda: collections.defaultdict(int))
text = open("sample_text")
previous = ""
for line in text:
    for letter in line:
        d_dict_int[previous][letter] += 1
        previous = (previous + letter)[-3:]
text.close()
# print the number of times any letter appeared after its (up to) 3 previous letters
for antecedents, letters in sorted(d_dict_int.iteritems()):
    print antecedents
    for letter, count in sorted(letters.iteritems()):
        print " %s - %s" % (letter, count)
You can even be clever and allow for arbitrarily deep dictionaries, for those who don't like to initialize anything.
def inf_dict():
    return collections.defaultdict(inf_dict)
d_infdict = inf_dict()
d_infdict[4][5]['foo'][6]['bar'] = 100
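A quick way to look at what ends up inside one of these is to dump it as JSON. The sketch below is Python 3, and the "typo"/"oops" keys are made up to show that merely reading a path creates it.

```python
import collections
import json

def inf_dict():
    # Each missing key gets another inf_dict, so any depth works.
    return collections.defaultdict(inf_dict)

d = inf_dict()
d[4][5]["foo"][6]["bar"] = 100

# Reading a path creates it too, which is easy to forget.
_ = d["typo"]["oops"]

# defaultdict is a dict subclass, so json can serialize it directly
# (integer keys come out as strings).
print(json.dumps(d, indent=2))
```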
One caveat to note. While checking for the existence of an entry does not create it, referencing it will fill it in (you do not need to assign to it), which makes try/except unsuitable for detecting missing keys. It should be easy enough to catch though, as expecting a KeyError from a defaultdict clearly makes no sense.
d = collections.defaultdict(int)
print 4 in d # False
print d[4] # 0
print 10 in d # False
print 4 in d # True
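If you want to peek without filling anything in, membership tests and .get() both leave the dictionary alone. A short Python 3 sketch of the difference (the keys are arbitrary):

```python
import collections

d = collections.defaultdict(int)

# Neither of these inserts the key.
print(4 in d)
print(d.get(4))
print(len(d))

# Indexing does insert it, with the factory's default value.
print(d[4])
print(len(d))

# A try/except KeyError never fires on a defaultdict that has a factory.
try:
    d[10]
    raised = False
except KeyError:
    raised = True
print(raised, 10 in d)
```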
Saturday, October 3, 2009
Getting 'New Tab' bookmark back in Google Chrome
Early versions of Chrome let you bookmark the 'New Tab' page, which I found very useful, since I was using it as my de facto home page.
In more recent versions of Chrome (not sure when; I am currently using 3.0.195.24), they removed that bookmark and disabled most ways of adding it to the bookmark bar. This was fairly irritating, and I would often find myself creating a new tab and closing the old one just to get to the page, which both took several keystrokes (or clicks) and had the unfortunate side effect of losing any history on that old tab.
I searched around for a way to get it back but didn't find anything useful; perhaps my terminology was off. In any case, a little guesswork showed that the new tab page is 'chrome://newtab'. You can add this URL directly by opening the bookmark manager (right-click on the bookmark bar, go through the wrench/settings menu, or press Ctrl-Shift-B), adding a new page (right-click on most things, or go to Organize>New Page...), and entering 'chrome://newtab' (no quotes) for the location.
Happy browsing.
Tuesday, May 12, 2009
creating a Debian Lenny VM in XenCenter 5.5
I recently had a fair amount of trouble trying to get lenny installed (from template, not CD) on the 5.5 beta release of Xen. I attempted to use the user interface for this (XenCenter 5.1). Misguided and/or whimsical google searches included the following:
installing lenny image xencenter 5.5.0
lenny xencenter 5.5.0
lenny xencenter
install lenny VM xencenter
xencenter install a Linux guest operating system from a network installation server
lenny mirror xen install
"location of the guest operating system"
xen install URL
xen "install location" expected value
xen "install url" debian expected value
xen "install url" expected value
xen "install url"
lenny iso
xen xe set other config value
xen xe vm-install set other config value
other-config:install-repository was not set to an appropriate value
xen lenny image
dists/lenny/main/installer-i386/current/images/netboot/xen/vmlinuz
Of which only the last really helped. The long and short of it appears to be that you cannot set up the values properly from the UI, and you must do it via command line. Below is a small script to accomplish this. I tried pointing the install repository to a variety of other places, but debian.org was the only one I found which has the desired /netboot/xen/vmlinuz path.
#!/bin/sh
# NOTE: This script does not add any frills, including network capability
# Run this from dom0
#
template="Debian Lenny 5.0"
name="Lenny VM"
repo="ftp://ftp.debian.org/debian"
vm_uuid=`xe vm-install template="$template" new-name-label="$name"`
xe vm-param-set uuid=$vm_uuid other-config:install-repository="$repo"
xe vm-start uuid=$vm_uuid
There were some additional suggested commands from the CD-ROM instructions, which I did not run. They would go before vm-start:
# root_uuid=`xe vbd-list vm-uuid= params=uuid --minimal`
# xe vbd-param-set uuid=$root_uuid bootable=true
Notes:
While it seems to want debian-500-i386-DVD-1.iso, the install goes too quickly (from ftp.debian.org) to actually be using the whole mound of GBs there.
From a Citrix representative posting:
http://forums.citrix.com/thread.jspa?messageID=1379090
From the above repro (sic) after a default install you'll have a 2.6.26-2-686-bigmem kernel. Install the Linux guest utilities from xs-tools.iso and you'll get the 2.6.29-xs5.5.0.13 kernel that we test with and have verified to address several issues with the standard kernel. In 5.5 the Lenny support is 32 bit only.
I also found this useful (indirect thanks, Trip):
http://www.scribd.com/doc/4084854/XenServer-410-Guest-Installation-Manual
Friday, February 6, 2009
creating an instance of a python class given the class' name as a string
Last week I was attempting to make a scalable test framework in python which dynamically loads python files that extend a known base class. I ran into problems trying to create instances of each test class. It seemed like new.classobj or instance should be able to satisfy my needs, but they only overwrote the class I was trying to load (they were in fact too dynamic/useful).
Was half-trying to figure this out for a day or two, finally found a solution at http://mail.python.org/pipermail/python-list/2004-April/257367.html
I don't care for it much, but it seems to get the job done, and makes sense on an implementation as well as a python-philosophy level.
The example given was
>>> class A:
...     def __init__(self, id):
...         self.id = id
...     def getID(self):
...         print self.id
...
>>> oa1 = globals()['A'](12345)
>>> oa1
<__main__.A instance at 0x...>
>>> oa1.getID()
12345
along with a valid note regarding the inherent kludge/betrayal/misconception in tying names to objects.
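One way to soften that kludge is to keep an explicit name-to-class table instead of reaching into globals(). The sketch below is Python 3 and swaps in that registry technique; REGISTRY, register and make are names made up here, not anything from the mailing list post.

```python
REGISTRY = {}

def register(cls):
    """Class decorator: record the class under its own name."""
    REGISTRY[cls.__name__] = cls
    return cls

@register
class A:
    def __init__(self, id):
        self.id = id

    def get_id(self):
        return self.id

def make(name, *args):
    """Look a class up by name and instantiate it (KeyError if unknown)."""
    return REGISTRY[name](*args)

oa1 = make("A", 12345)
print(oa1.get_id())
```

The lookup is the same dictionary access as globals()['A'], but only classes that opted in can be created from a string.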
----
A much more satisfying solution for my particular problem appeared shortly after, while experimenting with what shows up in globals().
With the following directory structure/files:
./script.py      # runner class
./test.py        # template class for tests
./tests/         # test files directory
    /test1.py    # test class
# Other code loads directory contents and iterates
# Variables replaced with contents for clarity
# Note that "./" and "/" around "tests" are only there for completeness
>>> import sys
>>> sys.path.append("./tests/")
>>> t = __import__("test1")
>>> t_class = t.__dict__["test1"]
>>> t_inst = t_class()
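Put together with the directory loop, the idea looks roughly like this in Python 3. It is a sketch under assumptions: the tests directory is a throwaway one created on the spot, the run method is invented for the demo, each file is assumed to define a class named after itself, and importlib.import_module is used in place of __import__.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway tests/ directory holding test1.py (stand-in for the real layout).
tests_dir = os.path.join(tempfile.mkdtemp(), "tests")
os.mkdir(tests_dir)
with open(os.path.join(tests_dir, "test1.py"), "w") as f:
    f.write("class test1(object):\n    def run(self):\n        return 'test1 ran'\n")

sys.path.append(tests_dir)
importlib.invalidate_caches()  # make sure the freshly written file is seen

instances = []
for filename in sorted(os.listdir(tests_dir)):
    name, ext = os.path.splitext(filename)
    if ext != ".py":
        continue
    module = importlib.import_module(name)  # same job as __import__(name)
    cls = getattr(module, name)             # class assumed to share the file's name
    instances.append(cls())

print([inst.run() for inst in instances])
```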