Asides

Creating a Threshold Slider

I wanted to modify this script by Neal Caren to create an adjustable graph that lets you control the citation threshold for nodes that appear on the graph. If, for example, you want to see only those nodes with twenty or more citations, you can just move the slider over, and the data will update automatically.

I have created three of these: Modernist Journals, Literary Theory, and Rhetoric and Composition. I’m sure there are several ways of going about this, and I’m equally sure that mine is far from the most efficient or practical.

I initially thought that creating one json file with the lowest threshold desirable with the script and then modifying the javascript file that creates the force-directed graph would be the easiest way. My lack of familiarity with both javascript and the internal data structures of D3.js soon led me to abandon this approach, however. I decided that it would be easier to generate separate json files for each citation threshold.

Doing this requires modifying Caren’s script. Add the following lines to the beginning:

Additions
import sys
threshold = int(sys.argv[1])
name = sys.argv[2]

Then find these lines in Caren’s script

Line to Change
article_list=open('topfourcites.txt').read()

if edge_dict[edge]>3 and cite_dict[edge[0]]>=8 and cite_dict[edge[1]]>=8 :

And change them to:

Change
article_list=open(name).read()

if edge_dict[edge]>3 and cite_dict[edge[0]]>= threshold and cite_dict[edge[1]]>= threshold :

This allows you to send arguments to the script to change the filename and also the allowable citation threshold. The next step is to use a simple script to iterate over the range of desired thresholds:

Perl Shell Script
$name=$ARGV[0];
for $i (3..20) {
  `python try.py $i $name`;
  `mv netweb/cites.json netweb/cites-$i.json`;
}

The range is set to 3 through 20 here, but it can easily be adjusted (or the script can easily be modified to take the bounds as arguments). You call the script with the name of the data file downloaded from Web of Science. It will then generate a series of json files labeled cites-3.json, cites-4.json, etc.
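The same loop can also be sketched in Python, for anyone who would rather not mix in Perl. This is only a rough equivalent under the same assumptions as above: the modified script is saved as try.py and writes its output to netweb/cites.json. The function names planned_runs and generate_all are made up for this illustration.

```python
import shutil
import subprocess
import sys


def planned_runs(name, low=3, high=20):
    """List (threshold, command, output path) for each threshold in low..high."""
    return [
        (t, [sys.executable, "try.py", str(t), name], f"netweb/cites-{t}.json")
        for t in range(low, high + 1)
    ]


def generate_all(name, low=3, high=20):
    """Run the modified script once per threshold and rename its output."""
    for threshold, command, target in planned_runs(name, low, high):
        # check=True stops the loop if the script exits with an error
        subprocess.run(command, check=True)
        # Assumes the script always writes netweb/cites.json
        shutil.move("netweb/cites.json", target)
```

Calling generate_all with the name of the Web of Science data file would then produce the same cites-3.json through cites-20.json files, and the bounds can be passed in rather than edited in place.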

The trick now is to change the cites.js code to enable the updates. You have to wrap the entire code that generates the force-directed layout in a function. I call this updateData():

Cites.js Changes
function updateData() {
if (first==0) {
first=1;
}
else {

    d3.select("#old").remove();

}

Feel free to laugh at the first check; I really didn’t know the syntax for first page loads in javascript, as risible as it sounds. The d3.select command clears the old network graph before the new one is rendered. You have to add a .attr("id","old") to the d3.select("#chart") definition for this to work.

You then need to add this line before the d3.json definition:

Load File Name
var file="cites-"+document.getElementById('id1').value+".json"

Then at the end of the file, add this:

Final Additions
}
var first=0;
updateData();
d3.selectAll("input").on("change", change);
function change() {
console.log("Change");
console.log(this.value);

updateData();
}

The first ending bracket is for the updateData function at the beginning. You can also leave out the console logs if you choose. The final step is to add the slider code to the cites.html file:

Slider Code
<form onsubmit="return false" oninput="level.value = id1.valueAsNumber">

  <label for="id1">Citation Threshold</label>  <input type="range" min="3" max="20" value="12" step="1" id="id1">
<output for="id1" name="level">12</output>/20
  </form>

The min and max should be adjusted to match your own thresholds, and the starting value should also be altered for your desired starting point.

I’m @joncgoodwin on twitter and gmail if you have any questions or suggestions.

Dying Rabbits

I checked back in to Project Rosalind a few days ago, and I noticed that they had added several new problems. One was the familiar Fibonacci sequence, beloved of introductory computer science instruction everywhere. There was also a modified version of the Fibonacci problem, however, which requires you to compute the sequence with mortal rabbits. (The normal Fibonacci sequence is often introduced as an unrealistic problem in modeling the population growth of immortal rabbits.)

I wanted to find a solution to this that didn’t involve manually keeping track of how many rabbits were breeding and dying, and it turned out to be more complicated than I originally thought. Brother U. Alfred tackled it in an early issue of The Fibonacci Quarterly (“Dying Rabbit Problem Revived”), only to be rebuked somewhat harshly by John H. E. Cohn in a subsequent issue. V. E. Hoggatt, Jr. and D. A. Lind devised a simple-seeming solution a few years later that I found quite difficult to compute. Luckily enough, this paper by Antonio M. Oller included some Maple code that I was able to translate into Perl:

Maple Translation
$h=2;
$n=193; #(Subtract one from offered values, indexing)
$k=17; # As above
use bignum;
for $i (0..($h-1)) {
  $c[$i]=1;

}

for $i ($h..($k+$h+2)) {
  $c[$i]=$c[$i-1]+$c[$i-$h];
}

for $i ((($k+$h)-1)..$n) {
  $counter=0;
  for $j (($i-$k-$h+1)..($i-$h)) {
    $counter=$counter+$c[$j];
  }
  $c[$i]=$counter;
}

print $c[$n];

There are very short Python and Common Lisp solutions posted on the solutions page (which you can only see after you have solved the problem) that I don’t understand at all, so there are clearly many other ways.
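For comparison, here is a rough Python transliteration of the Perl above. It is only a sketch, not the posted Python solution and not the published formulation from any of the papers mentioned. It takes the values as the problem offers them and subtracts one internally, as the Perl comments describe. The names mortal_rabbits and simulate_ages are made up here, and simulate_ages is exactly the kind of manual bookkeeping I wanted to avoid, included only as a cross-check for the case h = 2. Python integers are arbitrary precision, so nothing like bignum is needed.

```python
def mortal_rabbits(n, m, h=2):
    """Pairs alive in month n when each pair lives m months (recurrence)."""
    n0, k = n - 1, m - 1  # shift to the 0-indexed values used in the Perl
    size = max(n0, k + h + 2) + 1
    c = [0] * size
    for i in range(h):
        c[i] = 1
    # Ordinary (immortal) recurrence for the opening stretch
    for i in range(h, k + h + 3):
        c[i] = c[i - 1] + c[i - h]
    # Once deaths begin, each term is a sum over a window of earlier terms
    for i in range(k + h - 1, n0 + 1):
        c[i] = sum(c[i - k - h + 1 : i - h + 1])
    return c[n0]


def simulate_ages(n, m):
    """Brute-force check for h=2: track how many pairs are at each age."""
    ages = [1] + [0] * (m - 1)  # ages[a] = pairs that are a + 1 months old
    for _ in range(n - 1):
        newborn = sum(ages[1:])  # every pair past its first month breeds
        ages = [newborn] + ages[:-1]  # everyone ages; the oldest die
    return sum(ages)
```

With n = 6 and m = 3, both functions give 4, and the two agree on every small case I tried with h = 2. I have not checked other values of h.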

First Octopress

Various points of interest:

Snippet - snippet
m <- ggplot(my.topic, aes(x=year, y=prop))