This data was generated using an experimental algorithm inspired by k-means clustering, but designed specifically for the task of detecting posts that fall within an arXiv category but have no arXiv tag. The parameters were chosen to be very strict, so that only posts that are clearly within an arXiv category would be detected. I think looser parameters would be more useful for Yla's purposes, but this is the only data I currently have. I might try to repeat the experiment later this month.

ag.algebraic-geometry (2107, 0)

0.989 ag.algebraic-geometry

0.063 reference-request

at.algebraic-topology (846, 46)

0.932 at.algebraic-topology

0.306 homotopy-theory

0.101 ag.algebraic-geometry

0.076 cohomology

0.056 simplicial-stuff

0.054 model-categories

0.052 homological-algebra

ap.analysis-of-pdes (220, 98)

0.731 differential-equations

0.651 ap.analysis-of-pdes

0.128 analysis

0.091 elliptic-pde

0.080 reference-request

ct.category-theory (836, 0)

0.981 ct.category-theory

0.084 ag.algebraic-geometry

0.075 higher-category-theory

0.053 topoi

0.050 at.algebraic-topology

ca.classical-analysis (356, 0)

0.971 ca.classical-analysis

0.109 fa.functional-analysis

0.090 analysis

0.076 fourier-analysis

0.068 real-analysis

0.068 calculus

0.052 cv.complex-variables

0.052 ap.analysis-of-pdes

co.combinatorics (932, 0)

0.958 co.combinatorics

0.226 graph-theory

0.111 nt.number-theory

0.065 reference-request

0.050 algorithms

ac.commutative-algebra (702, 0)

0.921 ac.commutative-algebra

0.374 ag.algebraic-geometry

0.063 homological-algebra

cv.complex-variables (329, 110)

0.829 cv.complex-variables

0.356 complex-geometry

0.318 complex-analysis

0.250 ag.algebraic-geometry

0.064 dg.differential-geometry

0.061 riemann-surfaces

0.057 reference-request

0.053 analysis

dg.differential-geometry (779, 37)

0.957 dg.differential-geometry

0.190 riemannian-geometry

0.139 ag.algebraic-geometry

0.074 complex-geometry

0.067 at.algebraic-topology

0.061 lie-groups

0.059 reference-request

ds.dynamical-systems (227, 12)

0.946 ds.dynamical-systems

0.260 ergodic-theory

0.106 reference-request

0.053 classical-mechanics

0.053 dg.differential-geometry

fa.functional-analysis (478, 26)

0.950 fa.functional-analysis

0.231 banach-spaces

0.095 reference-request

0.084 measure-theory

0.074 fourier-analysis

0.067 analysis

0.065 pr.probability

Combinatorics and Algebra seem to be the exact same color, and so do applied and statistics. Geometry and number theory, while different, are too far from the graph itself, and I cannot tell which line is which tag.

David Roberts' suggestion to use dashed and dotted lines as well is a wonderful idea.

Thanks, Yla, for these graphs! (talk to you soon :)

Thanks! :-)

As for double counting: I am not double counting a post if it has two tags from the same category (e.g. if a post has 'set-theory' and 'logic', then foundations is increased by 1 post). However, I am allowing a post to count in two areas (e.g. if a post has 'logic' and 'algorithms', it will be counted for both foundations and applied). Conceptually I think this makes sense, because a post can legitimately be about two areas. I now have category theory only in foundations (I am not really sure where to put it, but I am not sure it deserves its own category either).
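The counting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script actually used for the graphs: the `CATEGORIES` mapping is a hypothetical, abbreviated stand-in for the full category list given elsewhere in the thread, the function names are made up, and the three sample posts are invented.

```python
from collections import Counter

# Hypothetical, abbreviated tag-to-category mapping (illustration only).
CATEGORIES = {
    "foundations": {"set-theory", "lo.logic", "ct.category-theory"},
    "applied": {"algorithms", "na.numerical-analysis"},
}

def categories_of(tags):
    """Return the set of categories that a post's tags fall into."""
    hit = {name for name, members in CATEGORIES.items() if members & set(tags)}
    # A post with none of the listed tags goes to "uncategorized".
    return hit or {"uncategorized"}

def count_posts(posts):
    """Count posts per category; each post adds at most 1 to a category."""
    counts = Counter()
    for tags in posts:
        counts.update(categories_of(tags))
    return counts

# Invented sample posts, each given as a list of tags.
posts = [
    ["set-theory", "lo.logic"],   # two foundations tags: counted once
    ["lo.logic", "algorithms"],   # counted for foundations and for applied
    ["reference-request"],        # none of the listed tags
]
counts = count_posts(posts)
print(counts)
```

Because each post is first reduced to a set of categories, two tags from the same category can never add more than one to that category's count, while a post spanning two categories adds one to each.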

As JDH suggested, I added the suggested tags to foundations. This increased the counts by a factor of ~1.7. I don't think limiting the tags to the top 50 is a great method. I wish that there were a good method for automatically clustering tags. I haven't tried any clustering algorithms, but my guess is that the overlap between the areas, and the fact that some tags (e.g. reference-request) say nothing about area, might make such an algorithm difficult.
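To make the worry about tags like reference-request concrete, here is a small sketch of one possible starting point for automatic grouping: the Jaccard similarity between the sets of posts that carry each tag. Nobody in this thread has proposed or tested this; the function names and the four sample posts are invented for illustration.

```python
from itertools import combinations

def tag_post_sets(posts):
    """Map each tag to the set of post indices that carry it."""
    index = {}
    for post_id, tags in enumerate(posts):
        for tag in set(tags):
            index.setdefault(tag, set()).add(post_id)
    return index

def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def tag_similarities(posts):
    """Similarity for every unordered pair of tags, keyed by sorted pair."""
    index = tag_post_sets(posts)
    return {
        (s, t): jaccard(index[s], index[t])
        for s, t in combinations(sorted(index), 2)
    }

# Invented sample posts, each given as a list of tags.
posts = [
    ["set-theory", "lo.logic"],
    ["set-theory", "lo.logic", "reference-request"],
    ["graph-theory", "co.combinatorics"],
    ["graph-theory", "reference-request"],
]
sims = tag_similarities(posts)
for pair, score in sorted(sims.items()):
    print(pair, round(score, 3))
```

In this toy data, reference-request comes out equally similar to set-theory and to graph-theory, which is the kind of area-neutral tag that would blur any clustering built on co-occurrence unless it is filtered out first.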

Still, most of the representation theory questions on M.SE are undergraduate level. I can't say there was much migration from MO to M.SE here.

I noticed that you have ct.category-theory in three of your categories; perhaps it would make sense for it to be its own category?

For your foundations grouping, I would suggest:

Foundations: lo.logic, set-theory, model-theory, computability-theory, proof-theory, cs.cc.complexity-theory, forcing, order-theory, axiom-of-choice, large-cardinals, math-philosophy, ultrafilters, boolean-algebra, metamathematics, foundations, nonstandard-analysis, but probably several of these are not in the top 50 tags.

Perhaps the methodology of using the top 50 tags is not sound, in that some subjects have dispersed themselves more finely into many tags, and so such subjects would be under-represented in your data.

- The category Combinatorics & Discrete Math shows more than twice the posts of any other category. (But see quid's correction later: I misinterpreted the colors!)
- There is a distinct downward trend in all categories in the last six months or so.

My guess is the latter is due to the influence of MSE. Perhaps the activity level in MO overall is decreasing?

I do think the categories could group the tags differently. For example, number theory should likely be separated from algebra. But regardless, very interesting data!

Here are my categories:

combinatorics & discrete math: co.combinatorics, graph-theory

foundations: set-theory, lo.logic, ct.category-theory

statistics & probability: pr.probability, st.statistics, measure-theory

numerical analysis & applied math: linear-algebra, mp.mathematical-physics, algorithms, cs.cc.complexity-theory, ds.dynamical-systems, oc.optimization-control, na.numerical-analysis, fourier-analysis

geometry: dg.differential-geometry, geometry, complex-geometry, mg.metric-geometry, riemannian-geometry, elliptic-curves, sg.symplectic-geometry, lie-groups, homological-algebra, lie-algebras, ct.category-theory

topology: at.algebraic-topology, gn.general-topology, gt.geometric-topology, homotopy-theory, cohomology, ct.category-theory

algebra: gr.group-theory, ra.rings-and-algebras, ac.commutative-algebra, rt.representation-theory, algebraic-groups, finite-groups, nt.number-theory, prime-numbers, ag.algebraic-geometry, arithmetic-geometry

analysis: ca.analysis-and-odes, fa.functional-analysis, cv.complex-variables, oa.operator-algebras, ap.analysis-of-pdes, calculus, differential-equations, measure-theory, ds.dynamical-systems

uncategorized: does not have one of these tags