Can be whatever you like.<p>At the moment I'm working on something that parses eBay URLs, and let's just say this isn't shaping up to be pretty. The following is a prototype that extracts search keywords from URLs:<p><pre><code> from urlparse import urlparse  # Python 2; on Python 3 use: from urllib.parse import urlparse

 # getDomain() is a helper that is not shown here; presumably it returns the URL's host name.
 def getKW(url):
     kw = ''
     if getDomain(url) == 'motors.shop.ebay.com':
         #url = "http://motors.shop.ebay.com/Parts-Accessories_Car-Truck-Parts-Accessories__f350-wheels-20_W0QQ_fxdZ1QQ_osacatZPartsQ2dAccessoriesQQ_trksidZm270Q2el1313&caz.html"
         #f350-wheels-20
         kw = urlparse(url)[2].split('__')[1].split('_W0QQ')[0]
     if getDomain(url) == 'cgi.ebay.com':
         #'http://cgi.ebay.com/ebaymotors/FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims_W0QQitemZ290277158739QQihZ019QQcategoryZ43955QQssPageNameZWDVWQQrdZ1QQcmdZViewItem&caz.html'
         #FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims
         kw = urlparse(url)[2].split('/ebaymotors/')[1].split('_W0QQ')[0]
     if getDomain(url) == 'shop.ebay.com':
         #url = 'http://shop.ebay.com/items/__mini-cooper-rims?_trkparms=72%3A543%7C66%3A2%7C65%3A12%7C39%3A1&caz.html'
         #mini-cooper-rims
         kw = urlparse(url)[2].split('/items/__')[1]
     # urlparse(url)[2] is the path component; the query string is already excluded
     return kw if kw else False</code></pre>
It's said I once wrote a Perl script to obtain queue times in a call center that took a screen capture via VNC, then carved up the snapshot into tiles for each queue and then OCR'd each tile to get the queue time. The time was then shoved into memcache and the process repeated. I acknowledge nothing.
You may want to use a regex to reduce or eliminate the if checks.<p>I use newLISP; this is the code:<p><pre><code> (set 'urls '(
   "http://motors.shop.ebay.com/Parts-Accessories_Car-Truck-Parts-Accessories__f350-wheels-20_W0QQ_fxdZ1QQ_osacatZPartsQ2dAccessoriesQQ_trksidZm270Q2el1313&caz.html"
   "http://cgi.ebay.com/ebaymotors/FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims_W0QQitemZ290277158739QQihZ019QQcategoryZ43955QQssPageNameZWDVWQQrdZ1QQcmdZViewItem&caz.html"
   "http://shop.ebay.com/items/__mini-cooper-rims?_trkparms=72%3A543%7C66%3A2%7C65%3A12%7C39%3A1&caz.html"))

 (define (getKW url)
   (find {([^/|^_]*)(_W0QQ|\?)} url 1) ; find using regex
   $1) ; return the first captured group, i.e. the text matched by ([^/|^_]*)

 (map println (map getKW urls))
 ;f350-wheels-20
 ;FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims
 ;mini-cooper-rims
</code></pre>
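For comparison with the Python prototype earlier in the thread, here is a rough sketch of the same regex idea in Python 3. The function name `get_kw_regex` is made up for illustration, and the pattern is a slightly tidied variant of the newLISP one (it simply excludes `/` and `_` from the keyword). It has only been checked against the three sample URLs above, so treat it as a starting point rather than a general eBay URL parser.

```python
import re

# Keyword = the run of characters (no '/' or '_') that sits directly
# before either the '_W0QQ' marker or the '?' that starts the query string.
KW_PATTERN = re.compile(r'([^/_]*)(?:_W0QQ|\?)')

def get_kw_regex(url):
    """Return the search keywords embedded in the URL, or None if not found."""
    match = KW_PATTERN.search(url)
    if match and match.group(1):
        return match.group(1)
    return None

urls = [
    "http://motors.shop.ebay.com/Parts-Accessories_Car-Truck-Parts-Accessories__f350-wheels-20_W0QQ_fxdZ1QQ_osacatZPartsQ2dAccessoriesQQ_trksidZm270Q2el1313&caz.html",
    "http://cgi.ebay.com/ebaymotors/FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims_W0QQitemZ290277158739QQihZ019QQcategoryZ43955QQssPageNameZWDVWQQrdZ1QQcmdZViewItem&caz.html",
    "http://shop.ebay.com/items/__mini-cooper-rims?_trkparms=72%3A543%7C66%3A2%7C65%3A12%7C39%3A1&caz.html",
]

for url in urls:
    print(get_kw_regex(url))
```

Because the pattern does not look at the host name at all, it avoids the per-domain if checks, at the cost of silently matching any URL that happens to contain `_W0QQ` or a `?`.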
I desperately needed to find a way to change the java.library.path at runtime, which is technically forbidden.<p>I finally stumbled upon this beautiful ugly hack (it opens up Sun's non-public class and hacks it via reflection):
<a href="http://forum.java.sun.com/thread.jspa?threadID=707176" rel="nofollow">http://forum.java.sun.com/thread.jspa?threadID=707176</a><p>It even works on Mac OS X.
I came up with quite an ugly hack in Python for the EventScripts plugin (<a href="http://python.eventscripts.com" rel="nofollow">http://python.eventscripts.com</a>).<p>It was designed to let you request a web page from a separate thread in pure Python and time out after a while (because threading is stupidly slow via ESP on game servers).<p>Code: <a href="http://errant.pastebin.com/f3d492f2d" rel="nofollow">http://errant.pastebin.com/f3d492f2d</a><p>That is older code, but it's all I can find atm. It has a tendency to crash things :P<p>The final code also had a lot of time.sleep(0) calls in it to force the threads to try and grab the GIL. Ugh :P<p>(Also, for the record, I picked up the hacked "kill" extension to threading.Thread from elsewhere :))
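The pastebin code is not reproduced here, so the following is only a minimal sketch of the general "run it in a thread and give up after a timeout" idea using the Python 3 standard library. The helper name `run_with_timeout` is hypothetical, and unlike the hacked "kill" extension mentioned above, it does not stop the worker thread: on timeout the daemon thread is simply abandoned.

```python
import threading
import time

def run_with_timeout(func, timeout, *args, **kwargs):
    """Run func(*args, **kwargs) in a worker thread.

    Returns the function's result, re-raises its exception, or raises
    TimeoutError if it has not finished within `timeout` seconds.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args, **kwargs)
        except Exception as exc:  # hand the error back to the caller
            outcome["error"] = exc

    # daemon=True so an abandoned worker cannot keep the process alive
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)

    if thread.is_alive():
        raise TimeoutError("call did not finish within %s seconds" % timeout)
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]

def slow_call():
    # Stand-in for a slow page fetch
    time.sleep(1)
    return "late"

print(run_with_timeout(lambda: "page body", 1.0))
try:
    run_with_timeout(slow_call, 0.05)
except TimeoutError as exc:
    print("timed out:", exc)
```

For an actual page fetch you could pass something like `urllib.request.urlopen` and a URL as `func` and its argument. `thread.join(timeout)` blocks the caller while waiting, so this is a different trade-off from sprinkling `time.sleep(0)` through the code to yield the GIL.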