

    I am just wondering: is it me or has there recently been a major (quantitative) decline in combinatorial and combinatorial-algebraic (representation theory, Hopf algebras, invariant theory, Lie algebras, even advanced linear algebra...) questions on MO? Is it that some people have left MO, or are busy, or just have got all their nagging questions answered and now can sleep well?


    It would be interesting to see how tag frequencies vary over time. I searched a bit to see if this has already been culled from the dumps, and did not find a direct hit. Here are three related discussions, the closest I could find:

    • CommentAuthor: theojf
    • CommentTime: Nov 29th 2011
    Well, I've been asking fewer Hopf-algebraic questions than I might have in the past, since it seems that right now what I really need to learn is rational homotopy theory. (Not that these aren't related.) I wish that there were more advanced linear algebra questions, but I've been assuming they have drifted over to math.SE.
    This doesn't directly answer your question, but I made a really crude graph. The graph shows that the tags are roughly stable in frequency over time. I grouped the top 50 tags into 8 categories and plotted the number of posts with at least one of these tags over time. I welcome any suggestions to improve the method (especially clustering tags more sensibly). Here is the graph:

    Here are my categories:

    combinatorics & discrete math: co.combinatorics, graph-theory

    foundations: set-theory, lo.logic, ct.category-theory

    statistics & probability: pr.probability, st.statistics, measure-theory

    numerical analysis & applied math: linear-algebra, mp.mathematical-physics, algorithms, ds.dynamical-systems, oc.optimization-control, na.numerical-analysis, fourier-analysis

    geometry: dg.differential-geometry, geometry, complex-geometry, mg.metric-geometry, riemannian-geometry, elliptic-curves, sg.symplectic-geometry, lie-groups, homological-algebra, lie-algebras, ct.category-theory

    topology: at.algebraic-topology, gn.general-topology, gt.geometric-topology, homotopy-theory, cohomology, ct.category-theory

    algebra: ra.rings-and-algebras, ac.commutative-algebra, rt.representation-theory, algebraic-groups, finite-groups, nt.number-theory, prime-numbers, ag.algebraic-geometry, arithmetic-geometry

    analysis: ca.analysis-and-odes, fa.functional-analysis, cv.complex-variables, oa.operator-algebras, ap.analysis-of-pdes, calculus, differential-equations, measure-theory, ds.dynamical-systems

    uncategorized: does not have one of these tags
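
    In case it helps anyone suggest improvements, the counting rule ("at least one of these tags") can be sketched in a few lines of Python. This is only an illustration: the table is a small subset of the categories above, and the names `CATEGORIES`, `categories_of`, and `tally` are made up for the example rather than taken from the script that produced the graph.

```python
from collections import Counter

# Illustrative subset of the tag-to-category table listed above.
CATEGORIES = {
    "combinatorics": {"co.combinatorics", "graph-theory"},
    "foundations": {"set-theory", "lo.logic", "ct.category-theory"},
    "topology": {"at.algebraic-topology", "gn.general-topology", "ct.category-theory"},
}

def categories_of(tags):
    """Return every category sharing at least one tag with the post."""
    hit = {name for name, members in CATEGORIES.items() if members & set(tags)}
    return hit or {"uncategorized"}

def tally(posts):
    """Count posts per category; a post adds at most 1 to any one category."""
    counts = Counter()
    for tags in posts:
        counts.update(categories_of(tags))
    return counts

# Hypothetical posts, each given as its list of tags.
posts = [
    ["set-theory", "lo.logic"],            # two foundations tags, one post
    ["ct.category-theory"],                # listed under two categories
    ["graph-theory", "co.combinatorics"],
    ["soft-question"],                     # none of the grouped tags
]
print(tally(posts))
```

    Because ct.category-theory sits in more than one category, the second post is counted in each of them.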

    Very nice, Yla! Aside from the rough stability over time, as you note, two features surprised me:

    • The category Combinatorics & Discrete Math shows more than twice the posts of any other category. (But see quid's correction later: I misinterpreted the colors!)
    • There is a distinct downward trend in all categories in the last six months or so.

    My guess is the latter is due to the influence of MSE. Perhaps the activity level in MO overall is decreasing?

    I do think the categories could group the tags differently. For example, number theory should likely be separated from algebra. But regardless, very interesting data!

    • CommentAuthor: quid
    • CommentTime: Nov 30th 2011

    @Joseph O'Rourke: I think the top category is Algebra (the colors are similar). The total tag count of co.combinatorics plus graph-theory is smaller than that of nt.number-theory and ag.algebraic-geometry individually(!), both in Algebra. And then there is almost as large as co in Algebra too.

    • CommentAuthor: JDH
    • CommentTime: Nov 30th 2011
    Great data, Yla!

    I noticed that you have ct.category-theory in three of your categories; perhaps it would make sense for it to be its own category?

    For your foundations grouping, I would suggest:

    Foundations: lo.logic, set-theory, model-theory, computability-theory, proof-theory, forcing, order-theory, axiom-of-choice, large-cardinals, math-philosophy, ultrafilters, boolean-algebra, metamathematics, foundations, nonstandard-analysis, but probably several of these are not in the top 50 tags.

    Perhaps the methodology of using the top 50 tags is not sound, in that some subjects have dispersed themselves more finely into many tags, and so such subjects would be under-represented in your data.
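
    To make the concern concrete, here is a toy sketch with invented numbers (none of them come from the MO dump). It sums tag counts per subject, once over all tags and once after keeping only the most frequent tags. It counts tag occurrences rather than posts, which is a simplification, and `subject_of` and `subject_totals` are names made up for the example.

```python
from collections import Counter

# Hypothetical tag counts, invented purely for illustration.
tag_counts = Counter({
    "big-subject": 100,          # one subject concentrated in a single tag
    "fine-a": 30, "fine-b": 30,  # another subject spread over four small tags
    "fine-c": 30, "fine-d": 30,
    "other": 40,
})
subject_of = {
    "big-subject": "concentrated",
    "fine-a": "dispersed", "fine-b": "dispersed",
    "fine-c": "dispersed", "fine-d": "dispersed",
    "other": "other",
}

def subject_totals(counts, top_n=None):
    """Sum tag counts per subject, optionally keeping only the top_n tags."""
    totals = Counter()
    for tag, n in counts.most_common(top_n):
        totals[subject_of[tag]] += n
    return totals

print(subject_totals(tag_counts))           # every tag included
print(subject_totals(tag_counts, top_n=2))  # only the two most frequent tags
```

    With every tag included the dispersed subject is the largest, but under the cutoff it vanishes entirely, since none of its individual tags is frequent enough to be kept.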

    @quid: Oh, I think I have just learned I am partially color blind! Thanks for clarifying. That makes much more sense!


    @Joseph: I think you were fooled (as I also initially was) by the fact that Combinatorics is above Algebra in the key, but the relationship in the graph is the opposite.

    I think the biggest problem is double counting of posts. One post being tagged twice is arguably not the same in terms of activity as two posts with one tag each.
    • CommentAuthor: quid
    • CommentTime: Nov 30th 2011

    I agree with Michael Greinecker. In particular if one post has two tags from the same category, say, nt and prime-numbers.


    Thanks for the statistics, Yla! So it seems that all activity on MO is decreasing, not just the one I care about. I wouldn't blame this completely on M.SE, since not many people would post a Hopf algebra or representation theory question (beyond "please clarify this definition" or "what textbooks to read") on M.SE. Maybe the activity we had here a year ago was anomalously high, due to newcomers posting all of their unanswered questions that arose during their previous work history?


    @darijgrinberg: I only found two MSE posts on Hopf algebras. There are more (233) questions tagged representation-theory, but I am in no position to judge their depth (or whether they are correctly tagged).


    Oh, for some reason I searched for the words rather than the tag...

    Still, most of the representation theory questions on M.SE are undergraduate level. I can't say there was much migration from MO to M.SE here.

    The seeming decline is an artifact of how I was plotting the data. I binned the data into 91-day increments, i.e. roughly 3 months. The last bin has slightly fewer days because I am using the dump from 9/1/11. I've fixed that by scaling the last bin as if it covered 91 days. Even with scaling it is slightly lower than the previous months, but my guess is that the difference is not meaningful. The data look like the number of posts is leveling out rather than declining.
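
    For anyone who wants to check the binning, here is a minimal sketch of the idea in Python. It assumes post dates are available as `datetime.date` objects; `binned_counts`, `start`, and `dump_date` are names made up for the example, and the dates in the usage lines are hypothetical, not taken from the dump.

```python
from collections import Counter
from datetime import date

BIN_DAYS = 91  # bin width, roughly 3 months

def binned_counts(post_dates, start, dump_date):
    """Count posts per 91-day bin and rescale a partial final bin.

    Assumes start <= d < dump_date for every post date d.
    """
    total_days = (dump_date - start).days
    n_bins = -(-total_days // BIN_DAYS)  # ceiling division
    raw = Counter((d - start).days // BIN_DAYS for d in post_dates)
    counts = [float(raw.get(i, 0)) for i in range(n_bins)]
    last_len = total_days - (n_bins - 1) * BIN_DAYS  # days in the final bin
    if 0 < last_len < BIN_DAYS:
        # Scale the short final bin as if it had covered a full 91 days.
        counts[-1] *= BIN_DAYS / last_len
    return counts

# Hypothetical data: 120 days of coverage, so the second bin has only 29 days.
dates = [date(2011, 1, 5), date(2011, 2, 1), date(2011, 3, 30),
         date(2011, 4, 5), date(2011, 4, 20)]
counts = binned_counts(dates, start=date(2011, 1, 1), dump_date=date(2011, 5, 1))
print(counts)
```

    The rescaled value is an extrapolation from fewer days, so it is noisier than the full bins.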

    As for double counting: I am not double counting a post if it has two tags from the same category (e.g. if a post has 'set-theory' and 'logic' then foundations gets increased by 1 post). However, I am allowing a post to count in two areas (e.g. if a post has 'logic' and 'algorithms' it will be counted for foundations and applied). Conceptually I think this makes sense because a post can legitimately be about two areas. I now have category theory only in foundations (I am not really sure where to put it, but I am not sure it deserves its own category).
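
    A small sketch of the difference between the two counting schemes, using the two examples just mentioned. The two-category table and the function names `tag_occurrences` and `post_count` are made up for the illustration.

```python
# Hypothetical two-category table; tag names follow the examples in the text.
CATEGORIES = {
    "foundations": {"set-theory", "logic"},
    "applied": {"algorithms"},
}

posts = [
    {"set-theory", "logic"},   # two tags, one category
    {"logic", "algorithms"},   # two tags, two categories
]

def tag_occurrences(posts, members):
    """Naive scheme: every matching tag adds 1, so one post can count twice."""
    return sum(len(tags & members) for tags in posts)

def post_count(posts, members):
    """Scheme described in the text: a post adds 1 if any of its tags match."""
    return sum(1 for tags in posts if tags & members)

for name, members in CATEGORIES.items():
    print(name, tag_occurrences(posts, members), post_count(posts, members))
```

    Note that the per-category post counts can still add up to more than the number of posts, since the second post is counted once in each of its two areas.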

    As JDH suggests I added the suggested tags to foundations. This increased the counts by a factor of ~1.7. I don't think limiting the tags to the top 50 is a great method. I wish that there were a good method for automatically clustering tags. I haven't tried any clustering algorithms, but my guess is that the overlap in the areas and the meaninglessness of some tags with respect to area (e.g. reference-request) might make such an algorithm difficult.
    I forgot: I also took number theory out of algebra.

    @Yla: Thanks for reworking the data! Still very interesting...


    Actually, I did some tag clustering at the end of the summer in the hope of detecting which arxiv tags were systematically underused. (lo.logic was the only really bad case.) Unfortunately, the data just suffered a major incident, but it will be recovered from backup drives within a couple of days. I will post the details as soon as I regain access to them.


    Is it possible to add the tag name next to the graph itself? I am color blind and it is nearly impossible for me to match the lines in the graph to the tag families.

    Thanks! :-)


    I would second the suggestion for separating out category theory from foundations. They do overlap in practice, but on MO one can tag a post ct.category-theory and set-theory and forcing (for instance) that falls in the overlap, and tag a post ct.category-theory and schemes and sheaves if it overlaps with algebraic geometry and so on.

    Thanks, Yla, for these graphs! (talk to you soon :)


    On a graphical design note, changing the lines with similar colours to be dashed, dotted, solid as necessary will help a lot. And I suppose 'similar' might also need to include red-green and other combinations indistinguishable to colour blind people. Alternatively, don't use green at all.


    Personally, the red-green issue is a bit less of a problem for me most of the time; my color blindness is completely atypical, and it is pink that I cannot see well. This makes many shades of similar (and less similar) colors somewhat hard to distinguish.

    Combinatorics and Algebra seem to be the exact same color, and so do applied and statistics. Geometry and number theory, while different, are too far from the graph itself, and I cannot tell which line is which tag.

    David Roberts's suggestion to use dashed and dotted lines as well is a wonderful idea.
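
    One possible way to do this, sketched in Python: give every series its own combination of line style and marker, so that no two lines have to be told apart by color alone. The plotting tool used for the graph is not named in this thread; the dictionary keys below are chosen to match keyword arguments that matplotlib's `plot` accepts, which is an assumption about the tool. `assign_styles` is a name made up for the example.

```python
LINESTYLES = ["-", "--", ":", "-."]   # solid, dashed, dotted, dash-dot
MARKERS = ["o", "s", "^", "D"]        # circle, square, triangle, diamond

def assign_styles(names):
    """Give each series a distinct (linestyle, marker) pair, independent of color."""
    pairs = [(ls, m) for m in MARKERS for ls in LINESTYLES]
    if len(names) > len(pairs):
        raise ValueError("more series than distinct style combinations")
    return {
        name: {"linestyle": ls, "marker": m, "label": name}
        for name, (ls, m) in zip(names, pairs)
    }

names = ["combinatorics", "foundations", "statistics", "applied", "geometry",
         "topology", "algebra", "analysis", "number theory", "uncategorized"]
styles = assign_styles(names)
for name in names:
    print(name, styles[name]["linestyle"], styles[name]["marker"])
```

    Writing each category name next to the end of its line, rather than only in a separate key, would also address the request to put the tag names by the graph itself.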

    Here are the results of the tag analysis I did this summer. These show the typical tag composition for each arxiv category. (The numbers are not frequencies, but higher numbers do indicate higher frequencies.) This data was generated using the August dump, so it is slightly dated. Moreover, some big tag renaming and mergers were done as a result of this analysis (e.g. 'optimization' was merged with 'oc.optimization-control'). However, the clustering is fairly accurate for the major arxiv categories.

    This data was generated using an experimental algorithm inspired by k-means clustering, but designed specifically for the task of detecting posts within an arxiv category but without an arxiv tag. Parameters were chosen to be very strict so that only posts that are clearly within an arxiv category would be detected. I think looser parameters would be more useful for Yla's purposes, but this is the only data I currently have. I might try to repeat the experiment later this month.

    ag.algebraic-geometry (2107, 0)
    0.989 ag.algebraic-geometry
    0.063 reference-request

    at.algebraic-topology (846, 46)
    0.932 at.algebraic-topology
    0.306 homotopy-theory
    0.101 ag.algebraic-geometry
    0.076 cohomology
    0.056 simplicial-stuff
    0.054 model-categories
    0.052 homological-algebra

    ap.analysis-of-pdes (220, 98)
    0.731 differential-equations
    0.651 ap.analysis-of-pdes
    0.128 analysis
    0.091 elliptic-pde
    0.080 reference-request

    ct.category-theory (836, 0)
    0.981 ct.category-theory
    0.084 ag.algebraic-geometry
    0.075 higher-category-theory
    0.053 topoi
    0.050 at.algebraic-topology

    ca.classical-analysis (356, 0)
    0.971 ca.classical-analysis
    0.109 fa.functional-analysis
    0.090 analysis
    0.076 fourier-analysis
    0.068 real-analysis
    0.068 calculus
    0.052 cv.complex-variables
    0.052 ap.analysis-of-pdes

    co.combinatorics (932, 0)
    0.958 co.combinatorics
    0.226 graph-theory
    0.111 nt.number-theory
    0.065 reference-request
    0.050 algorithms

    ac.commutative-algebra (702, 0)
    0.921 ac.commutative-algebra
    0.374 ag.algebraic-geometry
    0.063 homological-algebra

    cv.complex-variables (329, 110)
    0.829 cv.complex-variables
    0.356 complex-geometry
    0.318 complex-analysis
    0.250 ag.algebraic-geometry
    0.064 dg.differential-geometry
    0.061 riemann-surfaces
    0.057 reference-request
    0.053 analysis

    dg.differential-geometry (779, 37)
    0.957 dg.differential-geometry
    0.190 riemannian-geometry
    0.139 ag.algebraic-geometry
    0.074 complex-geometry
    0.067 at.algebraic-topology
    0.061 lie-groups
    0.059 reference-request

    ds.dynamical-systems (227, 12)
    0.946 ds.dynamical-systems
    0.260 ergodic-theory
    0.106 reference-request
    0.053 classical-mechanics
    0.053 dg.differential-geometry

    fa.functional-analysis (478, 26)
    0.950 fa.functional-analysis
    0.231 banach-spaces
    0.095 reference-request
    0.084 measure-theory
    0.074 fourier-analysis
    0.067 analysis
    0.065 pr.probability
    0.055 hilbert-space

    gm.general-mathematics (50, 0)
    0.976 gm.general-mathematics
    0.117 big-list
    0.098 soft-question
    0.059 lo.logic
    0.059 career

    gn.general-topology (499, 0)
    0.983 gn.general-topology
    0.095 at.algebraic-topology
    0.061 ct.category-theory

    gt.geometric-topology (452, 50)
    0.897 gt.geometric-topology
    0.321 at.algebraic-topology
    0.234 knot-theory
    0.096 3-manifolds
    0.087 dg.differential-geometry
    0.058 reference-request

    gr.group-theory (915, 30)
    0.159 finite-groups

    ho.history-overview (236, 0)
    0.948 ho.history-overview
    0.181 soft-question
    0.165 nt.number-theory
    0.117 reference-request
    0.104 big-list

    it.information-theory (68, 0)
    0.910 it.information-theory
    0.295 pr.probability
    0.174 st.statistics
    0.134 co.combinatorics
    0.067 measure-theory
    0.067 coding-theory
    0.054 linear-algebra
    0.054 reference-request
    0.054 mp.mathematical-physics
    0.054 lo.logic
    0.054 computer-science

    kt.k-theory-homology (102, 0)
    0.896 kt.k-theory-homology
    0.334 at.algebraic-topology
    0.176 ag.algebraic-geometry
    0.114 homotopy-theory
    0.088 algebraic-k-theory
    0.088 reference-request
    0.070 triangulated-categories
    0.053 nt.number-theory

    lo.logic (949, 350)
    0.718 lo.logic
    0.662 set-theory
    0.128 model-theory
    0.073 computability-theory
    0.061 forcing
    0.052 math-philosophy
    0.052 proof-theory
    0.050 axiom-of-choice

    mp.mathematical-physics (238, 0)
    0.960 mp.mathematical-physics
    0.145 dg.differential-geometry
    0.105 ag.algebraic-geometry
    0.073 differential-equations
    0.065 string-theory
    0.065 ap.analysis-of-pdes
    0.056 reference-request
    0.056 quantum-field-theory
    0.052 soft-question

    mg.metric-geometry (252, 0)
    0.962 mg.metric-geometry
    0.137 geometry
    0.122 dg.differential-geometry
    0.107 reference-request
    0.057 gn.general-topology

    nt.number-theory (1870, 0)
    0.977 nt.number-theory
    0.123 ag.algebraic-geometry
    0.079 prime-numbers
    0.056 arithmetic-geometry
    0.054 elliptic-curves
    0.052 analytic-number-theory

    na.numerical-analysis (194, 0)
    0.958 na.numerical-analysis
    0.193 linear-algebra
    0.128 matrices
    0.079 algorithms
    0.059 st.statistics

    oa.operator-algebras (224, 20)
    0.870 oa.operator-algebras
    0.320 von-neumann-algebras
    0.260 fa.functional-analysis
    0.188 c-star-algebras
    0.115 reference-request
    0.085 subfactors
    0.064 noncommutative-geometry

    oc.optimization-control (142, 98)
    0.899 optimization
    0.350 oc.optimization-control
    0.143 global-optimization
    0.135 linear-algebra
    0.064 convexity
    0.064 matrices
    0.056 algorithms

    pr.probability (699, 19)
    0.975 pr.probability
    0.138 stochastic-processes
    0.093 co.combinatorics
    0.066 stochastic-calculus
    0.062 markov-chains

    qa.quantum-algebra (166, 53)
    0.727 qa.quantum-algebra
    0.514 quantum-group
    0.347 hopf-algebras
    0.219 noncommutative-geometry
    0.084 rt.representation-theory
    0.064 monoidal-categories
    0.064 tqft
    0.058 lie-algebras
    0.051 reference-request
    0.051 fusion-categories

    rt.representation-theory (763, 139)
    0.911 rt.representation-theory
    0.250 lie-algebras
    0.218 lie-groups
    0.119 reference-request
    0.118 ag.algebraic-geometry
    0.092 algebraic-groups

    ra.rings-and-algebras (369, 0)
    0.970 ra.rings-and-algebras
    0.142 ac.commutative-algebra
    0.087 linear-algebra
    0.074 noncommutative-algebra
    0.074 rt.representation-theory

    sp.spectral-theory (68, 0)
    0.914 sp.spectral-theory
    0.269 fa.functional-analysis
    0.134 dg.differential-geometry
    0.108 linear-algebra
    0.108 matrices
    0.081 mp.mathematical-physics
    0.067 laplacian
    0.054 oa.operator-algebras
    0.054 ds.dynamical-systems
    0.054 rt.representation-theory
    0.054 differential-operators
    0.054 hilbert-space
    0.054 eigenvector

    st.statistics (386, 0)
    0.920 st.statistics
    0.381 pr.probability
    0.050 probability-distributions

    sg.symplectic-geometry (167, 0)
    0.927 sg.symplectic-geometry
    0.261 ag.algebraic-geometry
    0.155 dg.differential-geometry
    0.105 complex-geometry
    0.083 symplectic-topology
    0.078 gromov-witten-theory
    0.061 mp.mathematical-physics
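
    For readers who want something concrete to experiment with, here is a minimal Python sketch of the detection task described above: flagging posts that sit within an arxiv category but lack the arxiv tag. It is not the experimental algorithm that produced this data. The weights are copied from the lo.logic block, but summing them into a score, the cutoff value, and the name `missing_arxiv_tag` are all assumptions made for illustration.

```python
# Weights copied from the lo.logic block above; how to combine them is an assumption.
PROFILE = {
    "lo.logic": 0.718,
    "set-theory": 0.662,
    "model-theory": 0.128,
    "computability-theory": 0.073,
    "forcing": 0.061,
}
ARXIV_TAG = "lo.logic"
THRESHOLD = 0.6  # hypothetical cutoff; a higher value makes detection stricter

def missing_arxiv_tag(tags, profile=PROFILE, arxiv_tag=ARXIV_TAG, threshold=THRESHOLD):
    """Flag a post whose tags fit the profile but which lacks the arxiv tag."""
    if arxiv_tag in tags:
        return False  # already carries the arxiv tag, nothing to detect
    score = sum(profile.get(t, 0.0) for t in tags)
    return score >= threshold

print(missing_arxiv_tag({"set-theory", "forcing"}))    # True: 0.662 + 0.061 >= 0.6
print(missing_arxiv_tag({"model-theory"}))             # False: 0.128 < 0.6
print(missing_arxiv_tag({"lo.logic", "set-theory"}))   # False: already tagged
```

    Lowering the cutoff corresponds to the looser parameters mentioned above, at the cost of more false positives.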

    @François G. Dorais: this is great! I will try looking at the data with these new categories when I have some free time. (And try some better color/stroke labeling schemes.) I am also curious to know more about your algorithm if you would be willing to share.

    Yla, I will email you this weekend with details.