<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>simon.net.nz</title>
	<atom:link href="http://simon.net.nz/feed/" rel="self" type="application/rss+xml" />
	<link>http://simon.net.nz</link>
	<description>Simon Greenhill's Website</description>
	<pubDate>Mon, 02 Jun 2008 06:00:56 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7-bleeding</generator>
	<language>en</language>
			<item>
		<title>Talk: HBES 2008 - Pacific Settlement and Austronesian Languages</title>
		<link>http://simon.net.nz/articles/talk-hbes-2008-pacific-settlement-and-austronesian-languages/</link>
		<comments>http://simon.net.nz/articles/talk-hbes-2008-pacific-settlement-and-austronesian-languages/#comments</comments>
		<pubDate>Mon, 02 Jun 2008 06:00:56 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[austronesian]]></category>

		<category><![CDATA[conferences]]></category>

		<category><![CDATA[phylogenetics]]></category>

		<category><![CDATA[talks]]></category>

		<guid isPermaLink="false">http://simon.net.nz/?p=48</guid>
		<description><![CDATA[I will be talking about Pacific settlement and Austronesian languages at the Human Behavior and Evolution Society meeting next week in Kyoto, Japan:
The settlement of the Pacific is one of the great chapters of human history. This region was settled by the Austronesian people during the last 10,000 years, eventually encompassing the region from Taiwan, [...]]]></description>
			<content:encoded><![CDATA[<p>I will be talking about <strong>Pacific settlement and Austronesian languages</strong> at the <a href="http://www.hbes.com/">Human Behavior and Evolution Society</a> meeting <a href="http://beep.c.u-tokyo.ac.jp/%7Ehbes2008/index.htm">next week in Kyoto, Japan</a>:</p>
<blockquote><p>The settlement of the Pacific is one of the great chapters of human history. This region was settled by the Austronesian people during the last 10,000 years, eventually encompassing the region from Taiwan, to Hawaii, Easter Island (Rapanui), New Zealand, and Madagascar. Along the way, these people carried with them a distinctive &#8220;Lapita&#8221; culture and one of the largest language families in the world. There are two competing scenarios for this Austronesian expansion: either a rapid tree-like spread from Taiwan beginning around 6000 BP, or an expansion from a deeper Island South-East Asia origin around 17,000 BP. Over the last few years we have built a large comparative database of linguistic information from these languages and have begun using phylogenetic methods to explore Austronesian origins. The results of some phylogenetic analyses on 400 of these languages will be presented, along with what these results tell us about Pacific prehistory.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/talk-hbes-2008-pacific-settlement-and-austronesian-languages/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Parsing the Evolution of Language (Letter)</title>
		<link>http://simon.net.nz/articles/parsing-the-evolution-of-language-letter/</link>
		<comments>http://simon.net.nz/articles/parsing-the-evolution-of-language-letter/#comments</comments>
		<pubDate>Fri, 25 Apr 2008 00:19:06 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[publications]]></category>

		<guid isPermaLink="false">http://simon.net.nz/?p=47</guid>
		<description><![CDATA[Atkinson, Q.D., Meade, A., Venditti, C., Greenhill, S.J., &#038; Pagel, M. Parsing the Evolution of Language (Letter). Science 320 (5875), 446a.]]></description>
			<content:encoded><![CDATA[<blockquote><p>In our Brevia, we used the example of Webster&#8217;s Dictionary&#8211;widely regarded as the inaugurating dictionary of American English&#8211;to illustrate how the desire for a distinct social identity can motivate language changes, such as spelling. Of course, some changes may have begun much earlier. We are not aware that anyone has measured how rapid or gradual these changes were by using the sorts of quantitative methods we have developed, but it would be informative to do so&#8230;.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/parsing-the-evolution-of-language-letter/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Talk: Horizontal Transmission and Cultural Phylogenies</title>
		<link>http://simon.net.nz/articles/talk-horizontal-transmission-and-cultural-phylogenies/</link>
		<comments>http://simon.net.nz/articles/talk-horizontal-transmission-and-cultural-phylogenies/#comments</comments>
		<pubDate>Sat, 09 Feb 2008 03:44:59 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[conferences]]></category>

		<category><![CDATA[cultural evolution]]></category>

		<category><![CDATA[linguistics]]></category>

		<category><![CDATA[phylogenetics]]></category>

		<category><![CDATA[research]]></category>

		<category><![CDATA[talks]]></category>

		<guid isPermaLink="false">http://simon.net.nz/articles/talk-horizontal-transmission-and-cultural-phylogenies/</guid>
		<description><![CDATA[I&#8217;ll be talking at the NZ Phylogenetics Meeting this week on Horizontal transmission and cultural phylogenies:

Phylogenetic tree thinking is beginning to revolutionise studies of linguistic and cultural evolution. However, linguistic and cultural traits are easily transmitted horizontally (&#8220;borrowed&#8221;) between cultures. Indeed, well over 95% of the words in the Oxford English Dictionary aren&#8217;t English. A [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ll be talking at the <a href="http://www.math.canterbury.ac.nz/bio/whitianga08/">NZ Phylogenetics Meeting</a> this week on <i>Horizontal transmission and cultural phylogenies</i>:</p>
<blockquote><p>
Phylogenetic tree thinking is beginning to revolutionise studies of linguistic and cultural evolution. However, linguistic and cultural traits are easily transmitted horizontally (&#8220;borrowed&#8221;) between cultures. Indeed, well over 95% of the words in the Oxford English Dictionary aren&#8217;t English. A loud and persistent debate has centered around the issue of borrowing and whether it invalidates cultural phylogenies or not. Here, we use a natural model of linguistic evolution to simulate borrowing between languages. The results show that tree topologies constructed with Bayesian phylogenetic methods are relatively robust to the effects of realistic levels of borrowing. Inferences about time depth are slightly less robust.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/talk-horizontal-transmission-and-cultural-phylogenies/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Languages evolve in punctuational bursts</title>
		<link>http://simon.net.nz/articles/languages-evolve-in-punctuational-bursts/</link>
		<comments>http://simon.net.nz/articles/languages-evolve-in-punctuational-bursts/#comments</comments>
		<pubDate>Thu, 31 Jan 2008 20:33:22 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[cultural evolution]]></category>

		<category><![CDATA[linguistics]]></category>

		<category><![CDATA[phylogenetics]]></category>

		<category><![CDATA[publications]]></category>

		<category><![CDATA[research]]></category>

		<guid isPermaLink="false">http://simon.net.nz/articles/languages-evolve-in-punctuational-bursts/</guid>
		<description><![CDATA[Atkinson, Q. D., Meade, A., Venditti, C., Greenhill, S. J., &#038; Pagel, M. (2008) Languages evolve in punctuational bursts. Science, 319, 588.]]></description>
			<content:encoded><![CDATA[<p>Linguists speculate that human languages often evolve in rapid or punctuational bursts, sometimes associated with their emergence from other languages, but this phenomenon has never been demonstrated.  We use vocabulary data from three of the world’s major language groups – Bantu, Indo-European and Austronesian – to show that 10-33% of the overall vocabulary differences among these languages arises from rapid bursts of change associated with language splitting events. Our findings identify a general tendency for increased rates of linguistic evolution in fledgling languages, perhaps arising from a linguistic ‘founder effect’ or a desire to establish a distinct social identity.</p>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/languages-evolve-in-punctuational-bursts/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The Pleasures and Perils of Darwinizing Culture (with phylogenies)</title>
		<link>http://simon.net.nz/articles/the-pleasures-and-perils-of-darwinizing-culture-with-phylogenies/</link>
		<comments>http://simon.net.nz/articles/the-pleasures-and-perils-of-darwinizing-culture-with-phylogenies/#comments</comments>
		<pubDate>Mon, 10 Dec 2007 23:20:56 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[austronesian]]></category>

		<category><![CDATA[cultural evolution]]></category>

		<category><![CDATA[linguistics]]></category>

		<category><![CDATA[phylogenetics]]></category>

		<category><![CDATA[publications]]></category>

		<category><![CDATA[research]]></category>

		<guid isPermaLink="false">http://simon.net.nz/articles/the-pleasures-and-perils-of-darwinizing-culture-with-phylogenies/</guid>
		<description><![CDATA[Gray, R. D., Greenhill, S. J., &#038; Ross, R. M. (2007). The Pleasures and Perils of Darwinizing Culture (with phylogenies). Biological Theory, 2(4): 360-375.]]></description>
			<content:encoded><![CDATA[<p>Current debates about “Darwinizing culture” have typically focused on the validity of memetics.  In this paper we argue that meme-like inheritance is not a necessary requirement for descent with  modification. We suggest that an alternative and more productive way of Darwinizing culture can  be found in the application of phylogenetic methods.</p>
<p>We review recent work on cultural  phylogenetics and outline six fundamental questions that can be answered using the power and  precision of quantitative phylogenetic methods. However, cultural evolution, like biological  evolution, is often far from tree-like. We discuss the problems reticulate evolution can cause for  phylogenetic analyses and suggest ways in which these problems can be overcome.</p>
<p>Our solutions  involve a combination of new methods for the study of cultural evolution (network construction,  reconciliation analysis, and Bayesian mixture models), and the triangulation of different lines of  historical evidence. Throughout we emphasize that most debates about cultural phylogenies can  only be settled by empirical research rather than armchair speculation.</p>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/the-pleasures-and-perils-of-darwinizing-culture-with-phylogenies/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The Austronesian Basic Vocabulary Database</title>
		<link>http://simon.net.nz/articles/the-austronesian-basic-vocabulary-database/</link>
		<comments>http://simon.net.nz/articles/the-austronesian-basic-vocabulary-database/#comments</comments>
		<pubDate>Fri, 23 Nov 2007 10:12:38 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[austronesian]]></category>

		<category><![CDATA[publications]]></category>

		<guid isPermaLink="false">http://simon.net.nz/articles/the-austronesian-basic-vocabulary-database/</guid>
		<description><![CDATA[Greenhill, S. J., Blust. R, &#038; Gray, R.D. (2005-2007) The Austronesian Basic Vocabulary Database.  http://language.psy.auckland.ac.nz/austronesian]]></description>
			<content:encoded><![CDATA[<p>Greenhill, S. J., Blust. R, &amp; Gray, R.D. (2005-2007)      <em>The Austronesian Basic Vocabulary Database.</em>      http://language.psy.auckland.ac.nz/austronesian</p>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/the-austronesian-basic-vocabulary-database/feed/</wfw:commentRss>
		</item>
		<item>
		<title>COOL7 - Language trees and the des langues et base de données du vocabulaire austronésien</title>
		<link>http://simon.net.nz/articles/conference-cool7-language-trees-and-the-des-langues-et-base-de-donnees-du-vocabulaire-austronesien/</link>
		<comments>http://simon.net.nz/articles/conference-cool7-language-trees-and-the-des-langues-et-base-de-donnees-du-vocabulaire-austronesien/#comments</comments>
		<pubDate>Tue, 31 Jul 2007 07:10:57 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[austronesian]]></category>

		<category><![CDATA[conferences]]></category>

		<category><![CDATA[linguistics]]></category>

		<category><![CDATA[phylogenetics]]></category>

		<category><![CDATA[talks]]></category>

		<guid isPermaLink="false">http://simon.net.nz/articles/conference-cool7-language-trees-and-the-des-langues-et-base-de-donnees-du-vocabulaire-austronesien/</guid>
		<description><![CDATA[Much of the valuable linguistic data that has been collected over the years is languishing in filing cabinets and is not immediately available to linguists and interested members of the public. We (Gray &#038; Greenhill) are using this data to construct phylogenetic trees with computational methods adopted from evolutionary biology to test hypotheses about Pacific settlement. As part of this project we have &#8220;computerised&#8221; a large amount of lexical data, and constructed a large scale comparative database of this vocabulary. This data began with a collection of Swadesh lists collected by Blust over the last 20 years, and has been supplemented with lists from many other linguists and published resources. This Austronesian Basic Vocabulary Database is available on the internet at http://language.psy.auckland.ac.nz, and currently has word lists from 481 languages, for a total of over 100,000 entries. We shall describe some of the technologies required to build a repository such as this, and talk about the benefits of releasing data onto the internet for collaborative purposes. Finally, we will discuss our plans for expansion and consolidation of this database and make a special plea for more data. A few results from our recent analyses will be presented along the way.]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s the abstract of the talk I gave at the <a href="http://www.univ-nc.nc/COOL7.html">Seventh International Conference on Oceanic Linguistics (COOL7)</a>, in Noumea, New Caledonia, entitled <em>Language trees and the des langues et base de données du vocabulaire austronésien (Language trees and the Austronesian Basic Vocabulary Database)</em>:</p>
<blockquote><p>Nombre de données linguistiques essentielles recueillies au fil des ans dorment dans des placards et ne sont pas accessibles à la communauté linguistique ou au public intéressé. Nous (Gray &amp; Greenhill) exploitons ces données pour reconstruire des arbres phylogénétiques des langues à l&#8217;aide des méthodes informatiques utilisées en biologie de l&#8217;évolution et pour vérifier ainsi les hypothèses émises sur le peuplement du Pacifique. Dans cette optique, nous avons informatisé une grande quantité de données lexicales et construit une base de données à grande échelle. Nous avons débuté avec le vocabulaire fourni par les listes de Swadesh rassemblées par Blust ces dernières vingt années ; notre base s&#8217;est ensuite enrichie grâce aux données de nombreux linguistes ou de publications. On peut consulter la base de données de vocabulaire austronésien (Austronesian Basic Vocabulary Database) à l&#8217;adresse suivante: <a href="http://language.psy.auckland.ac.nz">http://language.psy.auckland.ac.nz</a>. Actuellement, cette base concerne 481 langues, pour un total de plus de 100 000 entrées. Nous présenterons quelques-unes des techniques requises pour construire une telle base et nous évoquerons l&#8217;intéret qu&#8217;il y a à mettre à disposition sur internet ces données pour des recherches en collaboration. Pour terminer, nous exposerons nos projets d’extension et de consolidation de cette base de données, en invitant les chercheurs à nous fournir de nouvelles données. Au cours de notre communication, nous mentionnerons quelques résultats issus de nos dernières analyses.</p></blockquote>
<p>..or in English:</p>
<blockquote><p> Much of the valuable linguistic data that has been collected over the years is languishing in filing cabinets and is not immediately available to linguists and interested members of the public. We (Gray &amp; Greenhill) are using this data to construct phylogenetic trees with computational methods adopted from evolutionary biology to test hypotheses about Pacific settlement. As part of this project we have &#8220;computerised&#8221; a large amount of lexical data, and constructed a large scale comparative database of this vocabulary. This data began with a collection of Swadesh lists collected by Blust over the last 20 years, and has been supplemented with lists from many other linguists and published resources. This Austronesian Basic Vocabulary Database is available on the internet at <a href="http://language.psy.auckland.ac.nz">http://language.psy.auckland.ac.nz</a>, and currently has word lists from 481 languages, for a total of over 100,000 entries. We shall describe some of the technologies required to build a repository such as this, and talk about the benefits of releasing data onto the internet for collaborative purposes. Finally, we will discuss our plans for expansion and consolidation of this database and make a special plea for more data. A few results from our recent analyses will be presented along the way.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/conference-cool7-language-trees-and-the-des-langues-et-base-de-donnees-du-vocabulaire-austronesien/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Simple webserver file alteration monitoring using integrit</title>
		<link>http://simon.net.nz/articles/simple-webserver-file-alteration-monitoring-using-integrit/</link>
		<comments>http://simon.net.nz/articles/simple-webserver-file-alteration-monitoring-using-integrit/#comments</comments>
		<pubDate>Wed, 06 Jun 2007 06:06:32 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[code]]></category>

		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://simon.net.nz/articles/simple-webserver-file-alteration-monitoring-using-integrit/</guid>
		<description><![CDATA[This shows us once again that any software you run on your website needs to be kept up to date, but what shocked me was that so many people running websites are not watching them for file changes. They had no idea that their sites had been hacked until they went looking.]]></description>
			<content:encoded><![CDATA[<h2>Intrusion detection?</h2>
<p>Over on <em><a href="http://mezzoblue.com/archives/2007/06/05/unsettling/index.php">Mezzoblue</a></em>, Dave Shea found out that his website had been subtly compromised. The attacker had exploited some (as yet unknown) security hole and quietly modified his website to link to the standard spam sites.</p>
<p>Within a few hours there were tens of posts from people who'd checked <em>their</em> websites and found similar modifications that had been sitting there unnoticed, with people either pointing the finger at old WordPress installations or guessing that their hosting service had been compromised.</p>
<p>This shows us once again that any software you run on your website needs to be kept up to date, but what shocked me was that so many people running websites are <strong>not watching them for file changes</strong>. They had <strong>no idea</strong> that their sites had been hacked until they went looking.</p>
<p>So - in bold: <strong>Anyone running a website or webserver of any type needs to watch out for unexpected access and changes.</strong></p>
<p>The easiest way to do this is to use an intrusion detection system (IDS). This sounds complex, but it's actually quite easy to do. All these programs do is monitor your files and warn you when they change. That kind of monitoring would have spotted this type of attack immediately.</p>
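<p>To make "monitor your files and warn you when they change" concrete, here is a minimal sketch of the idea in plain shell. It is only an illustration, not how integrit itself works: the function names <em>snapshot</em> and <em>compare_snapshots</em> are made up for this example, and it assumes <em>sha256sum</em> (GNU coreutils) is available on your host.</p>

```shell
#!/bin/sh
# Illustrative sketch only: checksum-based change detection.
# Assumes GNU coreutils' sha256sum; function names are hypothetical.

# snapshot DIR
# Print one "checksum  ./path" line per regular file under DIR, sorted by path.
snapshot() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# compare_snapshots OLD NEW
# Print the lines that differ between two snapshot files.
# Returns 0 when they are identical and non-zero when something changed.
compare_snapshots() {
    diff "$1" "$2"
}

# Example use (paths are placeholders):
#   snapshot "$HOME/example.com" > "$HOME/baseline.txt"    # once, on a clean site
#   snapshot "$HOME/example.com" > "$HOME/now.txt"         # later
#   compare_snapshots "$HOME/baseline.txt" "$HOME/now.txt" || echo "files changed!"
```

<p>A dedicated tool also tracks things like permissions and ownership, which this sketch ignores.</p>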
<p>Because of this, I've decided to write up an easy guide to simple file alteration monitoring - here it is.</p>
<h2>Choose your weapon:</h2>
<p>There are plenty of intrusion detection/file modification apps out there - some of the better known ones include <a href="http://www.cs.tut.fi/~rammer/aide.html">AIDE</a>, <a href="http://www.la-samhna.de/samhain/">Samhain</a> and <a href="http://www.tripwire.com/">Tripwire</a>. These are all very cool, and highly powerful, but are also quite complex and hard to install, especially on cheap shared hosting.</p>
<p>Therefore, my weapon of choice for the last few months has been a lightweight and fast application called <a href="http://integrit.sourceforge.net/">integrit</a>, so I'm going to tell you how to install it here.</p>
<p><strong>Before we start:</strong> Do make sure that you're not compromised right <em>now</em>; there's no point running an IDS if you're already hacked. While you're at it, make sure everything's upgraded too.</p>
<h2>Step 1: Make a place to store integrit:</h2>
<p>Since you're on shared hosting, you can't install integrit properly into /usr, but you need to put it somewhere anyway.</p>
<p>I decided to install it to a directory called "integrit" inside my home dir, so:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-1">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mkdir ~/integrit </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>In the following commands, do remember to replace any mention of <em>~/integrit</em> with the directory you used.</p>
<h2>Step 2: Download and install integrit:</h2>
<p>The integrit webpage is at <a href="http://integrit.sourceforge.net/">http://integrit.sourceforge.net/</a>, so go there and get the latest version (currently 4.1), or you could cut 'n' paste this:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-2">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">cd ~/integrit</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">wget http:<span style="color:#FF9933; font-style:italic;">//optusnet.dl.sourceforge.net/sourceforge/integrit/integrit-4.1.tar.gz </span></div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Once the integrit archive file is in your ~/integrit directory, we need to decompress it and install it:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-3">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">cd ~/integrit</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">tar -zxvf integrit-<span style="color:#800000;color:#800000;">4</span>.<span style="color:#800000;color:#800000;">1</span>.<span style="">tar</span>.<span style="">gz</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">cd integrit-<span style="color:#800000;color:#800000;">4</span>.<span style="color:#800000;color:#800000;">1</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./configure</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">make </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Next we need to put the integrit binary somewhere we can get at it. Here I've just dumped it into the ~/integrit directory, but you could put it in ~/bin or somewhere nicer if you want:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-4">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">cp integrit ~/integrit </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
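<p>After the copy, it may be worth confirming that the binary really landed where you expect and can be run. A small sketch; <em>check_binary</em> is a made-up helper name, not part of integrit:</p>

```shell
#!/bin/sh
# check_binary PATH
# Succeed only if PATH is a regular file that the current user can execute.
# (Hypothetical helper for this guide.)
check_binary() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        echo "ok: $1 is executable"
    else
        echo "missing or not executable: $1" >&2
        return 1
    fi
}

# Example use, assuming the ~/integrit layout from step 1:
#   check_binary "$HOME/integrit/integrit"
```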
<h2>Step 3: Set up integrit:</h2>
<p>Look in the integrit-4.1/examples directory and make a config file from the example.</p>
<p>You need three things at least:</p>
<ol>
<li>The <em>known</em> file database - this is the baseline snapshot of your files that integrit compares against.</li>
<li>The <em>current</em> file database - this is where integrit writes the fresh snapshot it takes each time it runs an update.</li>
<li>A <em>root</em> directory to monitor - this is the full path to the directory we want to watch.</li>
</ol>
<p>We can also tell integrit to <em>ignore</em> directories by listing them with an exclamation mark at the start of the line. We want to ignore the ~/integrit directory, and on DreamHost we'll need to ignore the webserver log directory (because it changes a lot, and our user can't access parts of it, which will cause errors).</p>
<p>All in all, it'll look something like this:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-5">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"># database locations <span style="color:#006600; font-weight:bold;">&#40;</span>FULL PATHS!<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">known=/home/simon/integrit/src_known.<span style="">cdb</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">current=/home/simon/integrit/src_current.<span style="">cdb</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"># What do we want to check <span style="color:#006600; font-weight:bold;">&#40;</span>no trailing slash!<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">root=/home/simon</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"># ignore the integrit dir:</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">!/home/simon/integrit</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"># ignore the webserver logs dir:</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">!/home/simon/logs</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"># oh, and the bash logfile</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">!/home/simon/.<span style="">bash_history</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p><em>Note:</em> You'll need to change "simon" to your user name, and "integrit" to the directory where you installed integrit in step 2.</p>
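<p>If you'd rather not edit the paths by hand, a small shell function can print the same config for any home and install directory. This is only a sketch: <em>write_integrit_conf</em> is a name invented for this example, and the ignore list simply mirrors the config above (logs directory and .bash_history), which may not match your host.</p>

```shell
#!/bin/sh
# write_integrit_conf HOME_DIR INSTALL_DIR
# Print a config in the same key=value layout as the example above.
# (Hypothetical helper; the ignored paths mirror the example and are assumptions.)
write_integrit_conf() {
    home_dir=${1%/}       # drop one trailing slash: root= must not end with one
    install_dir=${2%/}
    cat <<EOF
# database locations (FULL PATHS!)
known=$install_dir/src_known.cdb
current=$install_dir/src_current.cdb

# What do we want to check (no trailing slash!)
root=$home_dir

# ignore the integrit dir:
!$install_dir

# ignore the webserver logs dir:
!$home_dir/logs

# oh, and the bash logfile
!$home_dir/.bash_history
EOF
}

# Example use:
#   write_integrit_conf "$HOME" "$HOME/integrit" > home.conf
```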
<h2>Step 4: Create integrit database</h2>
<p>We need to get integrit to store a list of the current files, and their vital statistics, so run this command:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-6">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">~/integrit/integrit -C home.<span style="">conf</span> -u </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>...where "home.conf" is the integrit configuration file that you generated in step 3.</p>
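<p>The two comments in the config ("FULL PATHS!" and "no trailing slash!") are easy to get wrong, so a quick sanity check can save a confusing error. The sketch below assumes the config uses the plain <em>key=value</em> layout shown in step 3; <em>check_conf</em> is a hypothetical helper, not an integrit feature.</p>

```shell
#!/bin/sh
# check_conf FILE
# Verify that known=, current= and root= are present and start with "/",
# and that root= does not end with a slash. Returns non-zero on any problem.
# (Hypothetical helper; assumes one plain key=value pair per line.)
check_conf() {
    conf=$1
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    status=0
    for key in known current root; do
        value=$(sed -n "s/^$key=//p" "$conf" | head -n 1)
        case $value in
            /*) ;;   # absolute path, fine
            *)  echo "$key is missing or not a full path: '$value'" >&2
                status=1 ;;
        esac
    done
    case $(sed -n 's/^root=//p' "$conf" | head -n 1) in
        */) echo "root has a trailing slash" >&2
            status=1 ;;
    esac
    return $status
}

# Example use:
#   check_conf home.conf && ~/integrit/integrit -C home.conf -u
```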
<p>If all goes well, you'll see something like this:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-7">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit: ---- integrit, version <span style="color:#800000;color:#800000;">4</span>.<span style="color:#800000;color:#800000;">1</span> -----------------</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; output : human-readable</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;conf file : home.<span style="">conf</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; known db : /home/simon/integrit/src_known.<span style="">cdb</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; current db : /home/simon/integrit/src_current.<span style="">cdb</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; root : /home/simon</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; do check : no</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;do update : yes</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit: current-state db RMD160 --------------</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit: 3d6b135343a5031d357b5bb2d7d7dc39c7ab5646&nbsp; /home/simon/integrit/src_current.<span style="">cdb</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Once that's done, copy the newly created database to the known database location:</p>
<div class="igBar"><span id="lcode-8"><a href="#" onclick="javascript:showPlainTxt('code-8'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-8">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">cp ~/integrit/src_current.<span style="">cdb</span> ~/integrit/src_known.<span style="">cdb</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<h2>Step 5: Test that integrit's working:</h2>
<p>So let's make sure that integrit's working properly. To do this, we can add an empty file somewhere and see if integrit spots it.</p>
<div class="igBar"><span id="lcode-9"><a href="#" onclick="javascript:showPlainTxt('code-9'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-9">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">cd ~</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">touch foo </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Now we can run integrit:</p>
<div class="igBar"><span id="lcode-10"><a href="#" onclick="javascript:showPlainTxt('code-10'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-10">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">~/integrit/integrit -C ~/integrit/home.<span style="">conf</span>&nbsp; -c </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>If all goes well, you'll see something like this:</p>
<div class="igBar"><span id="lcode-11"><a href="#" onclick="javascript:showPlainTxt('code-11'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-11">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit: ---- integrit, version <span style="color:#800000;color:#800000;">4</span>.<span style="color:#800000;color:#800000;">1</span> -----------------</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; output : human-readable</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;conf file : /home/simon8/integrit/home.<span style="">conf</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; known db : /home/simon8/integrit/src_known.<span style="">cdb</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; current db : /home/simon8/integrit/src_current.<span style="">cdb</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; root : /home/simon8</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; do check : yes</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;do update : no</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">new:&nbsp; &nbsp; &nbsp;/home/simon8/foo&nbsp; &nbsp;p<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#800000;color:#800000;">664</span><span style="color:#006600; font-weight:bold;">&#41;</span> t<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#800000;color:#800000;">100000</span><span style="color:#006600; font-weight:bold;">&#41;</span> u<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#800000;color:#800000;">767504</span><span style="color:#006600; font-weight:bold;">&#41;</span> g<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#800000;color:#800000;">203016</span><span style="color:#006600; font-weight:bold;">&#41;</span> z<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#800000;color:#800000;">0</span><span style="color:#006600; font-weight:bold;">&#41;</span> m<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#800000;color:#800000;">20070605</span>-<span style="color:#800000;color:#800000;">162156</span><span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">new:&nbsp; &nbsp; &nbsp;/home/simon8/foo&nbsp; &nbsp;s<span style="color:#006600; font-weight:bold;">&#40;</span>9c1185a5c5e9fc54612808977ee8f548b2258d31<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">integrit: not doing update, so no check for missing files </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Notice how integrit's spotted the <em>foo</em> file that's not in the database? If one of the files has changed, you'll get much the same output, with "changed:" instead of "new:". So - remove the dummy file:</p>
<div class="igBar"><span id="lcode-12"><a href="#" onclick="javascript:showPlainTxt('code-12'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-12">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">rm ~/foo </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
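<p>If a report gets long, you might want to boil it down to just the files integrit flagged. Here's a rough sketch in Python, not part of integrit itself: it assumes flagged lines start with "new:" or "changed:" followed by the path, as in the output above, and that paths contain no spaces. The function name <em>flagged_paths</em> is made up for illustration.</p>

```python
def flagged_paths(report):
    """Return a sorted list of (status, path) pairs from an integrit report."""
    found = set()
    for line in report.splitlines():
        fields = line.split()
        # Assumption: flagged lines look like "new:   /path   p(664) ..."
        if len(fields) >= 2 and fields[0] in ("new:", "changed:"):
            found.add((fields[0].rstrip(":"), fields[1]))
    return sorted(found)


# Sample lines copied from the report shown earlier in this post
sample = """integrit: ---- integrit, version 4.1 -----------------
new:     /home/simon8/foo   p(664) t(100000) u(767504) g(203016) z(0) m(20070605-162156)
new:     /home/simon8/foo   s(9c1185a5c5e9fc54612808977ee8f548b2258d31)
integrit: not doing update, so no check for missing files"""

for status, path in flagged_paths(sample):
    print(status, path)
```

<p>For the sample above this prints a single line, "new /home/simon8/foo", because the two report lines for <em>foo</em> collapse into one entry.</p>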
<h2>Step 6: Get integrit to run daily:</h2>
<p>Now, we want to set up a cron job, so that integrit is run automatically for us. First of all, we should make a quick little shell script to run integrit and email the results to us:</p>
<div class="igBar"><span id="lcode-13"><a href="#" onclick="javascript:showPlainTxt('code-13'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-13">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">#!/bin/bash</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">DATE=`/bin/date <span style="color:#CC0000;">"+%F"</span>`</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">/home/simon/integrit/integrit -C /home/simon/integrit/home.<span style="">conf</span> -cu | /usr/bin/mutt -s <span style="color:#CC0000;">"integrit - $DATE"</span> email@example.<span style="">com</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Paste the above into a file called "run_integrit.sh" (a good place to put it would be in your ~/integrit directory), edit the paths to match your setup, and change the email address. Finally, make this file executable:</p>
<div class="igBar"><span id="lcode-14"><a href="#" onclick="javascript:showPlainTxt('code-14'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-14">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">chmod +x ~/integrit/run_integrit.<span style="">sh</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Now all we have to do is to add that to our crontab.</p>
<div class="igBar"><span id="lcode-15"><a href="#" onclick="javascript:showPlainTxt('code-15'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-15">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">crontab -e </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>...and add a line that looks something like this:</p>
<div class="igBar"><span id="lcode-16"><a href="#" onclick="javascript:showPlainTxt('code-16'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-16">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#800000;color:#800000;">59</span>&nbsp; &nbsp; &nbsp;<span style="color:#800000;color:#800000;">21</span>&nbsp; &nbsp; &nbsp; *&nbsp; &nbsp; &nbsp; &nbsp;*&nbsp; &nbsp; &nbsp; &nbsp;*&nbsp; &nbsp; &nbsp;/home/simon/integrit/run_integrit.<span style="">sh</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>This will run integrit at 21:59 every day. If you don't know what that means, have a google for "crontab tutorial".</p>
<p>Save the file, and you're off.</p>
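<p>For the curious, the five fields at the start of a crontab line are minute, hour, day of month, month and day of week, followed by the command to run. Here's a small Python sketch that labels them for the line used above. It's only an illustration (cron does its own parsing), and the helper name <em>describe_cron_line</em> is made up.</p>

```python
FIELD_NAMES = ("minute", "hour", "day of month", "month", "day of week")


def describe_cron_line(line):
    """Split a crontab line into its five schedule fields and the command."""
    # Split on whitespace at most five times so the command stays in one piece
    parts = line.split(None, 5)
    if len(parts) != 6:
        raise ValueError("expected five schedule fields and a command")
    schedule = dict(zip(FIELD_NAMES, parts[:5]))
    return schedule, parts[5]


schedule, command = describe_cron_line(
    "59 21 * * * /home/simon/integrit/run_integrit.sh"
)
for name in FIELD_NAMES:
    print(name, "=", schedule[name])
print("command =", command)
```

<p>A "*" in a field means "every", so this line fires at minute 59 of hour 21 on every day of every month.</p>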
<h2>Updating the database after valid changes:</h2>
<p>When you've changed or added a file yourself, then you'll need to update your <em>known</em> database with these changes. To do this, just generate a <em>current</em> database, and copy it over the old one. The script I've got above will automatically generate a <em>current</em> one, so you can just use that version, or repeat Step 4.</p>
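<p>If you'd rather script that copy, here's a minimal sketch in Python. It assumes the database file names from Step 4 (<em>src_current.cdb</em> and <em>src_known.cdb</em>) sit in one directory. The function name <em>accept_current</em> and the ".bak" backup of the old known database are my own additions, not something integrit does for you.</p>

```python
import shutil
from pathlib import Path


def accept_current(integrit_dir):
    """Copy src_current.cdb over src_known.cdb, backing up the old known db first."""
    base = Path(integrit_dir).expanduser()
    current = base / "src_current.cdb"
    known = base / "src_known.cdb"
    if known.exists():
        # Keep the previous known database in case the copy was a mistake
        shutil.copy2(known, base / "src_known.cdb.bak")
    shutil.copy2(current, known)
    return known
```

<p>Call it as accept_current("~/integrit"), but only after you've read the latest report and are happy that every change in it is one you made yourself.</p>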
<h2>Final considerations:</h2>
<p><strong>Note</strong>: If you can, you should run integrit (that is BOTH the database files and the binary files) off a "safe" partition, that's not writable. Unfortunately, most of us on shared hosting don't have that privilege, so just be aware that if a really clever attacker does get you, then they're likely to disable or modify the IDS if they can.</p>
<p>A good way of dealing with this is to copy your known file database off the webserver and make sure that the one on the server matches this one every so often.</p>
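<p>One way to do that comparison is to hash both copies and check that the digests agree. Here's a sketch in Python, under a few assumptions: it uses SHA-256 from the standard <em>hashlib</em> module rather than the RMD160 that integrit prints, and the function names are made up for illustration.</p>

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def databases_match(server_copy, offline_copy):
    """True if the two database files have identical contents."""
    return file_digest(server_copy) == file_digest(offline_copy)
```

<p>Download the server's <em>src_known.cdb</em> to your own machine and pass both paths to databases_match(). If it returns False, treat the copy on the server as suspect.</p>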
<p>--Simon</p>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/simple-webserver-file-alteration-monitoring-using-integrit/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Doom 2007 - Pacific settlement and Austronesian languages.</title>
		<link>http://simon.net.nz/articles/doom-2007-pacific-settlement-and-austronesian-languages/</link>
		<comments>http://simon.net.nz/articles/doom-2007-pacific-settlement-and-austronesian-languages/#comments</comments>
		<pubDate>Wed, 28 Mar 2007 12:26:54 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[austronesian]]></category>

		<category><![CDATA[talks]]></category>

		<guid isPermaLink="false">http://simon.net.nz/articles/doom-2007-pacific-settlement-and-austronesian-languages/</guid>
		<description><![CDATA[The settlement of the Pacific is one of the greatest population movements in the last 10,000 years, and led to the settlement of the region bounded by Taiwan, Hawaii, Easter Island (Rapanui), New Zealand, and Madagascar. This Austronesian expansion brought with it (and developed along the way) a distinctive Lapita cultural complex and what has become the largest language family in the world, with over 1,000 languages.

There are a number of scenarios describing this Austronesian expansion as either a rapid tree-like spread from Taiwan beginning around 6000 BP, or expansion from a deeper Island South-East Asia origin around 13,000 BP. Over the last few years we have built a large comparative database of linguistic information from these languages (Austronesian Basic Vocabulary Database) and have begun using phylogenetic methods on it. ]]></description>
			<content:encoded><![CDATA[<h1>Pacific settlement and Austronesian languages.</h1>
<h2><strong>Greenhill, S.J.</strong> &amp; R.D. Gray</h2>
<p><em>Department of Psychology, University of Auckland, New Zealand</em></p>
<p class="highlight"> This is the abstract of a talk I gave at <a href="http://www.math.canterbury.ac.nz/bio/doom07/">Doom '07 - The Annual New Zealand Phylogenetics Meeting</a> at Whakapapa, New Zealand.</p>
<p>The settlement of the Pacific is one of the greatest population movements in the last 10,000 years, and led to the settlement of the region bounded by Taiwan, Hawaii, Easter Island (Rapanui), New Zealand, and Madagascar. This Austronesian expansion brought with it (and developed along the way) a distinctive Lapita cultural complex and what has become the largest language family in the world, with over 1,000 languages.</p>
<p>There are a number of scenarios describing this Austronesian expansion as either a rapid tree-like spread from Taiwan beginning around 6000 BP, or expansion from a deeper Island South-East Asia origin around 13,000 BP. Over the last few years we have built a large comparative database of linguistic information from these languages (<a href="http://language.psy.auckland.ac.nz/austronesian">Austronesian Basic Vocabulary Database</a>) and have begun using phylogenetic methods on it.</p>
<p>We will present results from some large analyses of lexical data from over 300 languages, and demonstrate the power of these data to resolve these questions.</p>
<hr /><p><img src="http://simon.net.nz/files/abvd_hand.jpg" alt="Words for hand from ABVD" height="579" width="800" /></p>
<p>A picture from <a href="http://earth.google.com/">Google Earth</a>, showing the Austronesian words meaning "hand". Data from the <a href="http://language.psy.auckland.ac.nz/austronesian/">Austronesian Basic Vocabulary Database</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/doom-2007-pacific-settlement-and-austronesian-languages/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Query PubMed for citation information using a DOI and Python</title>
		<link>http://simon.net.nz/articles/query-pubmed-for-citation-information-using-a-doi-and-python/</link>
		<comments>http://simon.net.nz/articles/query-pubmed-for-citation-information-using-a-doi-and-python/#comments</comments>
		<pubDate>Mon, 29 Jan 2007 14:01:44 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
		
		<category><![CDATA[code]]></category>

		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://simon.net.nz/articles/query-pubmed-for-citation-information-using-a-doi-and-python/</guid>
		<description><![CDATA[Here's a simple little script to query PubMed for a Digital Object Identifier (a DOI).]]></description>
			<content:encoded><![CDATA[<p>Here's a simple little script to query <a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi">PubMed</a> for a <a href="http://en.wikipedia.org/wiki/Digital_object_identifier">Digital Object Identifier</a> (a DOI).</p>
<p>Usage is quite simple: find a DOI somewhere, e.g. <em>10.1038/nature02029</em> (for <a href="http://language.psy.auckland.ac.nz/publications/index.php?pub=Gray_and_Atkinson2003Nature">this groundbreaking paper</a>), and run this:</p>
<div class="igBar"><span id="lcode-17"><a href="#" onclick="javascript:showPlainTxt('code-17'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-17">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">lurch:~ python pythonquery.<span style="">py</span> <span style="color:#800000;color:#800000;">10</span>.<span style="color:#800000;color:#800000;">1038</span>/nature02029 </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>... and via the magic of webservices and XML, and with a bit of luck, you'll get something like this back:</p>
<div class="igBar"><span id="lcode-18"><a href="#" onclick="javascript:showPlainTxt('code-18'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-18">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Language-tree divergence times support the Anatolian theory of Indo-European origin.</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="">Gray</span>, RD, Atkinson, QD</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Nature <span style="color:#800000;color:#800000;">2003</span>, <span style="color:#800000;color:#800000;">426</span> <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#800000;color:#800000;">6965</span><span style="color:#006600; font-weight:bold;">&#41;</span>:<span style="color:#800000;color:#800000;">435</span>-<span style="color:#800000;color:#800000;">9</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Languages, like genes, provide vital clues about human history. <span style="">The</span> origin of</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">the Indo-European language family is <span style="color:#CC0000;">"the most intensively studied, yet still</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#CC0000;">most recalcitrant, problem of historical linguistics"</span>. <span style="">Numerous</span> genetic studies</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">of Indo-European origins have also produced inconclusive results. <span style="">Here</span> we</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">analyse linguistic data using computational methods derived from evolutionary</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">biology. <span style="">We</span> test two theories of Indo-European origin: the <span style="color:#CC0000;">'Kurgan expansion'</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">and the <span style="color:#CC0000;">'Anatolian farming'</span> hypotheses. <span style="">The</span> Kurgan theory centres on possible</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">archaeological evidence for an expansion into Europe and the Near East by</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Kurgan horsemen beginning in the sixth millennium BP. <span style="">In</span> contrast, the Anatolian</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">theory claims that Indo-European languages expanded with the spread of</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">agriculture from Anatolia around <span style="color:#800000;color:#800000;">8</span>,<span style="color:#800000;color:#800000;">000</span>-<span style="color:#800000;color:#800000;">9</span>,<span style="color:#800000;color:#800000;">500</span> years bp. <span style="">In</span> striking agreement</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">with the Anatolian hypothesis, our analysis of a matrix of <span style="color:#800000;color:#800000;">87</span> languages with</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#800000;color:#800000;">2</span>,<span style="color:#800000;color:#800000;">449</span> lexical items produced an estimated age range for the initial Indo-European</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">divergence of between <span style="color:#800000;color:#800000;">7</span>,<span style="color:#800000;color:#800000;">800</span> and <span style="color:#800000;color:#800000;">9</span>,<span style="color:#800000;color:#800000;">800</span> years bp. <span style="">These</span> results were robust to</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">changes in coding procedures, calibration points, rooting of the trees and priors</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">in the bayesian analysis. </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<h2>The Code:</h2>
<div class="igBar"><span id="lpython-19"><a href="#" onclick="javascript:showPlainTxt('python-19'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">PYTHON:</span>
<div id="python-19">
<div class="python">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;">#!/usr/bin/env python</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;">#&nbsp; &nbsp;Simple script to query pubmed for a DOI</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;">#&nbsp; &nbsp;(c) Simon Greenhill, 2007</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;">#&nbsp; &nbsp;http://simon.net.nz/</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">urllib</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">xml</span>.<span style="color: black;">dom</span> <span style="color: #ff7700;font-weight:bold;">import</span> minidom</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">def</span> get_citation_from_doi<span style="color: black;">&#40;</span>query, <span style="color: #dc143c;">email</span>=<span style="color: #483d8b;">'YOUR EMAIL GOES HERE'</span>, tool=<span style="color: #483d8b;">'SimonsPythonQuery'</span>, database=<span style="color: #483d8b;">'pubmed'</span><span style="color: black;">&#41;</span>:</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">params = <span style="color: black;">&#123;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #483d8b;">'db'</span>:database,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #483d8b;">'tool'</span>:tool,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #483d8b;">'email'</span>:<span style="color: #dc143c;">email</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #483d8b;">'term'</span>:query,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #483d8b;">'usehistory'</span>:<span style="color: #483d8b;">'y'</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #483d8b;">'retmax'</span>:<span style="color: #ff4500;color:#800000;">1</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: black;">&#125;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;"># try to resolve the PubMed ID of the DOI</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">url = <span style="color: #483d8b;">'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?'</span> + <span style="color: #dc143c;">urllib</span>.<span style="color: black;">urlencode</span><span style="color: black;">&#40;</span>params<span style="color: black;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">data = <span style="color: #dc143c;">urllib</span>.<span style="color: black;">urlopen</span><span style="color: black;">&#40;</span>url<span style="color: black;">&#41;</span>.<span style="color: black;">read</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;"># parse XML output from PubMed...</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">xmldoc = minidom.<span style="color: black;">parseString</span><span style="color: black;">&#40;</span>data<span style="color: black;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">ids = xmldoc.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Id'</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;"># nothing found: raise an error the caller can catch</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>ids<span style="color: black;">&#41;</span> == <span style="color: #ff4500;color:#800000;">0</span>:</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">raise</span> <span style="color: #008000;">LookupError</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">"DoiNotFound"</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;"># get ID</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #008000;">id</span> = ids<span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;"># remove unwanted parameters</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">params.<span style="color: black;">pop</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'term'</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">params.<span style="color: black;">pop</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'usehistory'</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">params.<span style="color: black;">pop</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'retmax'</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;"># and add new ones...</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">params<span style="color: black;">&#91;</span><span style="color: #483d8b;">'id'</span><span style="color: black;">&#93;</span> = <span style="color: #008000;">id</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">params<span style="color: black;">&#91;</span><span style="color: #483d8b;">'retmode'</span><span style="color: black;">&#93;</span> = <span style="color: #483d8b;">'xml'</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;"># get citation info:</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">url = <span style="color: #483d8b;">'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?'</span> + <span style="color: #dc143c;">urllib</span>.<span style="color: black;">urlencode</span><span style="color: black;">&#40;</span>params<span style="color: black;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">data = <span style="color: #dc143c;">urllib</span>.<span style="color: black;">urlopen</span><span style="color: black;">&#40;</span>url<span style="color: black;">&#41;</span>.<span style="color: black;">read</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">return</span> data</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">def</span> text_output<span style="color: black;">&#40;</span><span style="color: #dc143c;">xml</span><span style="color: black;">&#41;</span>:</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #483d8b;">""</span><span style="color: #483d8b;">"Makes a simple text output from the XML returned from efetch"</span><span style="color: #483d8b;">""</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">xmldoc = minidom.<span style="color: black;">parseString</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">xml</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">title = xmldoc.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'ArticleTitle'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">title = title.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">abstract = xmldoc.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'AbstractText'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">abstract = abstract.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">authors = xmldoc.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'AuthorList'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">authors = authors.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Author'</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">authorlist = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">for</span> author <span style="color: #ff7700;font-weight:bold;">in</span> authors:</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">LastName = author.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'LastName'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Initials = author.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Initials'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">author = <span style="color: #483d8b;">'%s, %s'</span> % <span style="color: black;">&#40;</span>LastName, Initials<span style="color: black;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">authorlist.<span style="color: black;">append</span><span style="color: black;">&#40;</span>author<span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">journalinfo = xmldoc.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Journal'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">journal = journalinfo.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Title'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">journalinfo = journalinfo.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'JournalIssue'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">volume = journalinfo.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Volume'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">issue = journalinfo.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Issue'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">year = journalinfo.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Year'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;"># the page range is not part of JournalIssue, so look up MedlinePgn on the whole document</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">pages = xmldoc.<span style="color: black;">getElementsByTagName</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'MedlinePgn'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">childNodes</span><span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">data</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">output = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">output.<span style="color: black;">append</span><span style="color: black;">&#40;</span>title<span style="color: black;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">output.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#empty line</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">output.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">', '</span>.<span style="color: black;">join</span><span style="color: black;">&#40;</span>authorlist<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">output.<span style="color: black;">append</span><span style="color: black;">&#40;</span> <span style="color: #483d8b;">'%s %s, %s (%s):%s'</span> % <span style="color: black;">&#40;</span>journal, year, volume, issue, pages<span style="color: black;">&#41;</span> <span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">output.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#empty line</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">output.<span style="color: black;">append</span><span style="color: black;">&#40;</span>abstract<span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">return</span> output</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">if</span> __name__ == <span style="color: #483d8b;">'__main__'</span>:</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">sys</span> <span style="color: #ff7700;font-weight:bold;">import</span> argv, exit</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>argv<span style="color: black;">&#41;</span> == <span style="color: #ff4500;color:#800000;">1</span>:</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'Usage: %s &lt;query&gt;'</span> % argv<span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">' e.g. %s 10.1038/ng1946'</span> % argv<span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">0</span><span style="color: black;">&#93;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">exit<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">citation = get_citation_from_doi<span style="color: black;">&#40;</span>argv<span style="color: black;">&#91;</span><span style="color: #ff4500;color:#800000;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">for</span> line <span style="color: #ff7700;font-weight:bold;">in</span> text_output<span style="color: black;">&#40;</span>citation<span style="color: black;">&#41;</span>:</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #ff7700;font-weight:bold;">print</span> line </div>
</li>
</ol>
</div>
</div>
</div>
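<p>The first half of the script (build the esearch query, then pull the first ID out of the reply) can be tried without touching the network. The sketch below is written for Python 3, unlike the listing above, and the names <code>build_esearch_url</code> and <code>first_pubmed_id</code>, the example email address and the sample reply are all made up for illustration; the sample XML is hand-written, not real PubMed output.</p>

```python
from urllib.parse import urlencode
from xml.dom import minidom

ESEARCH = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?'


def build_esearch_url(doi, email, tool='SimonsPythonQuery', database='pubmed'):
    """Return the esearch URL that looks up a DOI (nothing is fetched here)."""
    params = {
        'db': database,
        'tool': tool,
        'email': email,
        'term': doi,
        'usehistory': 'y',
        'retmax': 1,
    }
    return ESEARCH + urlencode(params)


def first_pubmed_id(esearch_xml):
    """Return the text of the first <Id> in an esearch reply, or None if there are no hits."""
    ids = minidom.parseString(esearch_xml).getElementsByTagName('Id')
    if not ids:
        return None
    return ids[0].firstChild.data


# Hand-written sample reply with a hypothetical ID, for illustration only.
SAMPLE = ('<eSearchResult><Count>1</Count>'
          '<IdList><Id>12345678</Id></IdList></eSearchResult>')

print(build_esearch_url('10.1038/ng1946', 'me@example.org'))
print(first_pubmed_id(SAMPLE))
```

<p>Returning <code>None</code> for "no hits" is just one option; raising an exception, as the script above does, works equally well if the caller would rather catch an error.</p>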
<p></p>
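<p>One thing to watch in <code>text_output</code>: every field is read with <code>[0].childNodes[0].data</code>, which raises an <code>IndexError</code> as soon as a record lacks that element (no issue number or no abstract, say). A small helper with a fallback value is one way around that. This is a minimal Python 3 sketch; <code>first_text</code> is a hypothetical name and the sample record is hand-written rather than real efetch output.</p>

```python
from xml.dom import minidom


def first_text(node, tag, default=''):
    """Text of the first <tag> below node, or default if it is missing or empty."""
    found = node.getElementsByTagName(tag)
    if not found:
        return default
    # Join only the direct text children; nested markup is ignored.
    text = ''.join(child.data for child in found[0].childNodes
                   if child.nodeType == child.TEXT_NODE)
    return text or default


# Hand-written sample record (deliberately has no <Issue>), for illustration only.
SAMPLE = (
    '<Article>'
    '<Journal><JournalIssue><Volume>38</Volume>'
    '<PubDate><Year>2006</Year></PubDate></JournalIssue>'
    '<Title>Example Journal</Title></Journal>'
    '<ArticleTitle>An example title</ArticleTitle>'
    '</Article>'
)

doc = minidom.parseString(SAMPLE)
print(first_text(doc, 'Volume'))
print(first_text(doc, 'Issue', default='n/a'))
```

<p>With something like this, each lookup in <code>text_output</code> could become a single call, and a missing field shows up as a placeholder instead of a traceback.</p>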
<p>--Simon</p>
]]></content:encoded>
			<wfw:commentRss>http://simon.net.nz/articles/query-pubmed-for-citation-information-using-a-doi-and-python/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>

