LaTeXML 0.8.7 converting arXiv (08.2023)
Baseline package support
Total article sources converted: 2,086,032
- 0 unsupported LaTeX packages were used in more than 100,000 articles.
- 7 unsupported LaTeX packages were used in more than 10,000 articles.
Namely: epic.sty, tikz-cd.sty, biblatex.sty, arydshln.sty, mdframed.sty, mhchem.sty, tabu.sty
- 183 unsupported LaTeX packages were used in more than 1,000 articles.
- 802 unsupported LaTeX packages were used in more than 100 articles.
- 2,260 unsupported LaTeX packages were used in more than 10 articles.
- 11,046 unsupported LaTeX packages were used in at least 1 article.
These statistics may carry extra noise from other errors during conversion, as well as from the specific latexml configuration, but the general magnitudes should be reliable. Also note that a "LaTeX package" here does not mean a CTAN package: custom author ".sty" files included in the submission bundles are also counted.
This report was extracted from the build system details at:
https://corpora.mathweb.org/corpus/arxmliv/tex%5Fto%5Fhtml/warning/missing%5Ffile?all=true
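The bucketed counts above can be reproduced from a per-package tally. Here is a minimal Python sketch, assuming a hypothetical mapping from each missing package name to the number of articles that requested it. The names and counts below are made up for illustration and are not taken from the report.

```python
from typing import Dict, List, Tuple


def bucket_counts(usage: Dict[str, int], thresholds: List[int]) -> List[Tuple[int, int]]:
    """For each threshold t, count the packages used in more than t articles."""
    return [(t, sum(1 for n in usage.values() if n > t)) for t in thresholds]


# Hypothetical tally: package file name -> number of articles that used it.
usage = {"a.sty": 12000, "b.sty": 1500, "c.sty": 150, "d.sty": 11, "e.sty": 1}

# "More than 0" corresponds to the "at least 1 article" row of the report.
for t, count in bucket_counts(usage, [100_000, 10_000, 1_000, 100, 10, 0]):
    print(f"{count} unsupported packages were used in more than {t:,} articles")
```

Because each bucket is cumulative, a package counted at the 10,000 level is counted again at every lower threshold, which is why the numbers grow as the threshold drops.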
New styling experiment:
ar5iv now uses "text-wrap: balance" #CSS for headings, which kicks in on small screens.
Here's before/after in Chrome 📸
🗓️ The May 2023 arXiv articles are now in ar5iv.
May was a record month for arXiv, with over 20,000 submissions. 92% of those had a TeX source, which allowed the ar5iv collection to finally exceed 2 million articles!
We are now at 2,004,153 viewable HTML pages.
🗓️ The April 2023 arXiv articles are now in ar5iv.
And in case you missed arXiv's accessibility forum last month, the keynote talks have been made available online:
https://www.youtube.com/playlist?list=PLYgeAMJvRZ6axwfwqkcjMaq0N80mtwBaq
🗓️ The March 2023 arXiv articles are now in #ar5iv.
In related news: arXiv is hosting an online accessibility forum on the 17th:
https://info.arxiv.org/about/accessibility_forum.html
May be related? 🤔
Register (free) and find out.
@b3nb3n I'd seen that "ar5iv" trick posted before, and tried it a few times, but it never worked. I guess because I'm usually clicking links to brand new papers that haven't been ported to HTML5 yet. If they want it to catch on, they need to post the '5' version along with the regular 'x' page.
One more whine... Changing one char in an iPhone Safari link is maddeningly difficult! arXiv has a Format Selector that allows saving cookies to set your choice - they need to add HTML5 there! #ar5iv
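The one-character swap described above can be scripted rather than typed by hand. This is a minimal sketch, assuming an arxiv.org link maps to the same path on ar5iv.org. The function name is hypothetical, the article ID is a placeholder, and a brand-new paper may not have been converted yet.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ar5iv(url: str) -> str:
    """Swap an arxiv.org host for ar5iv.org; any other URL is returned unchanged."""
    parts = urlsplit(url)
    if parts.netloc.lower() in ("arxiv.org", "www.arxiv.org"):
        # Assumption: the path (e.g. /abs/<id>) is kept exactly as it was.
        parts = parts._replace(netloc="ar5iv.org")
    return urlunsplit(parts)


# Placeholder article ID, for illustration only.
print(to_ar5iv("https://arxiv.org/abs/2302.00001"))
```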
@b3nb3n Tranquility shows all the equations in the same bright display style as the text. Strangely, JustRead shows a few of the larger ones less bright with a green background - no clue what that means!
In iOS, Safari reader mode works as poorly for me as it always does. The only white-on-black setting dims the white way down - but leaves the borders and decorations glaring bright! To get high contrast you have to select black on white and then invert the entire display...
#Accessibility #ar5iv
This February, we passed 2 million usable TeX sources in arXiv.
And you can preview 97.6% of them on the web.
The 2302 articles went live tonight.
https://ar5iv.labs.arxiv.org/
🗓️ The January 2023 arXiv articles are now in #ar5iv.
A month at a time, we've now marked a year of updates since ar5iv went live.
Indeed, #ar5iv is an ongoing experiment under the arXiv Labs umbrella, and we (the LaTeXML team) are busily improving it until it reaches the point where everyone agrees it is worth becoming official.
LaTeXML 0.8.7 was just released!
We're ready for MathML Core, which is expected to be available in all major browsers early in 2023.
With gratitude to the wider academic community, who helped drive another productive year of extending our TeX interpretation fidelity and our LaTeX ecosystem coverage.
Full release notes at:
https://github.com/brucemiller/LaTeXML/releases/tag/v0.8.7
"Based on our user research the step our community wants arXiv to take is clear: offer well formatted, accessible HTML alongside existing sources."
The #arXiv corpus is vast. It has more content (and #LaTeX programming) than a person can read in a lifetime. Naturally, when converted to HTML, the long tail of issues is often mysterious and unexplored.
Help us out:
At the bottom of each page in #ar5iv, there is a "report an issue" button. Use it to file a quick issue whenever you see flaws of any kind. Your report could open the path to fixing hundreds if not thousands of related articles.
#introduction Hi everyone, Deyan here!
I tend to discuss the journey of converting #arXiv into #ar5iv: an HTML5 preview site for the world's largest preprint server.
I'm helping to develop the next generation of #latexml and #mathml, focusing on the most idiosyncratic corners of #LaTeX and math syntax.
And you'll see the occasional AI art / Large language model experiment flying by as well...
#latex #mathml #latexml #ar5iv #arxiv #introduction