next up previous
Next: 4. What are They Up: Who is Interested in Repository Previous: 2. The Stony Brook

3. What are People Looking For?

Out of the 1.5 million hits recorded on this site over the one year interval, 393,467 of them were to primary html and shtml files. This latter count more accurately represents the number of mouse-clicks performed by users than the total hits, since most of the remaining hits are on image files associated with these pages. Therefore, we will limit further analysis to hits on these files.

Because user ID information is not logged on our WWW server, it is difficult to judge exactly how many different people accounted for these hits. Based on the roster of machines which accessed the site, I estimate that roughly 40,000 different people paid a visit during this 10 week study. Some fraction of hits came from webcrawler robots instead of human users, however I believe they had only a minor effect on our statistics. Observe that the least frequently clicked shtml file (containing the copyright notice for the site) was hit only 298 times versus 17,733 hits for the most frequently accessed page (the front page).

Table 1: Hits by Major Section Index
Problem Category Index Hits Subsection Hits Problems
Data Structures 6698 14492 6
Numerical Problems 4499 12812 11
Combinatorial Problems 3419 10114 10
Graph Problems: Polynomial 4496 17492 12
Graph Problems: Hard 3849 13447 11
Computational Geometry 7031 25129 16
Set and String Problems 3003 10776 9
Totals 32995 104262 75

Table 1 reports the number of hits distributed among our highest level of classification - the seven major subfields of algorithms. Two different hit measures are reported for each subfield, first the number of hits to the menu of problems within the subfield, and second the total number of hits to individual problem pages within this subfield. Computational geometry proved to be the most popular subfield by both measures, although outweighed by the interest in graph problems split across two subtopics. Data structures recorded the highest ``per-problem'' interest, but I was surprised by the relative lack of enthusiasm for set and string algorithms.

Table 2: Hits by Programming Language Index
Programming Language Index Hits Implementations
C language 4776 37
C++ 5397 11
Fortran 868 6
Lisp 698 1
Mathematica 652 3
Pascal 1556 5
Totals 13947 63

Table 2 reports the number of hits distributed among the various programming language submenus. C++ seems to have supplanted C as the most popular programming language among developers, although there is clearly a lag in the size of the body of software written in C++. C remains the source language for over half the implementations available on the Algorithm Repository. User interest in Mathematica rivals that of Fortran, perhaps suggesting that computer algebra systems are becoming the language of choice for scientific computation. There was no submenu associated with Java, reflecting what was available when I built the repository. The total number of implementations in Table 2 is greater than 56 because seven codes are written in more than one language.

Table 3: Most and least popular algorithmic problems, by repository hits.
Most Popular Problems Hits Least Popular Problems Hits
shortest-path 3660 generating-subsets 854
kd-trees 3198 edge-coloring 817
dictionaries 3022 satisfiability 792
traveling-salesman 2963 independent-set 789
convex-hull 2963 cryptography 786
nearest-neighbor 2936 text-compression 767
minimum-spanning-tree 2922 maintaining-arrangements 727
voronoi-diagrams 2815 set-packing 703
triangulations 2786 planar-drawing 678
sorting 2734 median 671
graph-data-structures 2596 bandwidth 629
string-matching 2304 factoring-integers 628
suffix-trees 2213 shortest-common-superstring 520
priority-queues 2208 determinants 520
geometric-primitives 2162 feedback-set 483

Table 3 reports the 15 most popular and least popular algorithmic problems, as measured by the number of hits the associated pages received. Hit counts for all of the 75 problems appears in Table 6. Several observations can be drawn from this data:

It is interesting to note that only 17,733 hits occurred to the front-page of the site, which suggests that most visitors never saw the main index of the site. This implies that most users initially entered the site through a keyword-oriented search engine, and gives credence to the notation that these hits measure problem interest more than just directionless wandering through the site.

next up previous
Next: 4. What are They Up: Who is Interested in Repository Previous: 2. The Stony Brook
Steve Skiena