Writing a Data Science Book with Quarto (Using Jupyter Notebooks or Pandoc)
Writing a book with Quarto

Original link: https://blog.stephenturner.us/p/quarto-books

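In the spirit of learning in public and exploring publishing formats beyond the traditional ones, the author migrated an old course website built with RMarkdown to Quarto, the next-generation scientific publishing system that supports Python, R, and OJS. Quarto can produce many kinds of output (HTML, PDF, books, websites, and more) from a single document, and can even embed interactive code. The project converted a graduate course, "Biological Data Science with R," originally taught at UVA, into a polished e-book. Notably, the existing RMarkdown source ran in Quarto with only minor adjustments. The main changes were updating file references and terminology to suit the book format. The finished book is hosted on GitHub Pages (bdsr.stephenturner.us), with PDF and EPUB versions generated automatically. Although the course material dates from 2015-2018 and reflects older package versions, the author highlights Quarto's potential for modern publishing and points to Quarto Books, Manuscripts, and Dashboards as resources for further exploration.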

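## Quarto for Books and Slides: Pros and Cons

A Hacker News discussion centered on using Quarto (and reveal.js) to produce books and presentations. Quarto does well on version control, embedding Python and other code, and math typesetting, but user experiences vary.

Some commenters found Quarto/reveal.js frustrating for detailed lecture slides because it struggles with image handling, precise positioning, and animation, tasks that PowerPoint handles easily. They reported spending a lot of time on HTML/JavaScript workarounds.

Other users had positive experiences, especially economists who work mainly with text, equations, and charts. Being able to generate both a paper *and* slides from a single source file is a major advantage.

One user successfully self-published a book with Quarto and appreciated its flexibility in formatting for different output types (EPUB versus PDF), including automated translation. Ultimately, the tool's suitability depends largely on how complex the required output is and on individual workflow preferences.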

Original text

In the spirit of learning in public, I wanted an excuse to dive into Quarto to learn more about publishing formats beyond simple PDF and HTML documents.

If you’re not familiar, Quarto (quarto.org) is the successor to RMarkdown: a next-generation scientific publishing system that works natively with Python, R, and OJS. If you already have RMarkdown documents, you probably don’t have to change anything to get them to render with Quarto. The wonderful thing about Quarto (and to a lesser extent, RMarkdown) is that you can write one single input document and render many types of output documents: HTML, PDF, Word docs, presentations (PowerPoint, Beamer, RevealJS), dashboards, websites, books, blogs, and more. And with Quarto Live, you can embed WebAssembly-powered interactive code blocks for R and Python right into a Quarto document (example here).

I demonstrate here how I turned an old course website of mine made from a bunch of RMarkdown documents into a polished e-book using Quarto. I also briefly point out Quarto Manuscripts and Quarto Dashboards at the end.

You can read the book or download a PDF at https://bdsr.stephenturner.us/.

Biological Data Science with R

Back when I was faculty at UVA I started a series of workshops in response to the growing demand for practical education in data science and bioinformatics that the traditional coursework at the time lacked. I eventually turned this into a graduate course, and later into a course directed to faculty seeking a career in translational science. The course was a Software Carpentry style live coding hands-on course, mostly using R, that covered topics including data manipulation with dplyr, visualization with ggplot2, predictive modeling with caret, text mining with tidytext, RNA-seq analysis with DESeq2, basic statistics, survival analysis, and other topics.

I made the course website using RMarkdown Websites — a feature that I don’t think ever got much traction, but I found incredibly useful. You put a _site.yml file in the root of your project, and you got a little “Build Website” button in the RStudio build pane. Hit that button and it would render all the RMarkdown documents in the project, and give you a website with pages listed as they are in the _site.yml file. I took a lot of inspiration from Jenny Bryan’s old STAT545 course, and borrowed teaching ideas from other courses and blog posts around the internet.
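For readers who never used RMarkdown Websites, a _site.yml might look roughly like the sketch below. The site name, page titles, and file names are placeholders chosen for illustration, not the actual configuration of the workshops site.

```yaml
# _site.yml (hypothetical sketch of an RMarkdown website config)
name: "workshops"
navbar:
  title: "Workshops"
  left:
    # each entry becomes a link in the top navigation bar
    - text: "Home"
      href: index.html
    - text: "Data manipulation"
      href: dplyr.html        # rendered from a hypothetical dplyr.Rmd
    - text: "Visualization"
      href: ggplot2.html      # rendered from a hypothetical ggplot2.Rmd
output:
  html_document:
    toc: true                 # table of contents on every page
```

The navbar entries determine which rendered pages are linked and in what order.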

This actually worked fairly well! My old workshop and course material website is still alive at stephenturner.github.io/workshops (screenshot below), and the code is all open on GitHub (github.com/stephenturner/workshops).

—Quarto has entered the chat—

Quarto showed up on my radar in 2022 at the last rstudio::conf (before it became posit::conf), where Posit announced the name change, public benefit corp status, Shiny for Python, and Quarto. I’ve slowly switched most of my technical authoring from RMarkdown to Quarto. I had always been a big fan of the rticles package, and now Quarto is starting to catch up with journal article templates (I wrote the biorecap paper using a generic arXiv Quarto template).

I have bookmarks to so many great reference books including R for Data Science, Hands-On Programming with R, and Python for Data Analysis, all of which are written as Quarto books.

The docs (quarto.org/docs/books) looked fairly simple. Just stick a bunch of qmd files in a directory and reference them in a _quarto.yml file. I wanted an excuse to explore the book authoring experience with Quarto, so I grabbed all the source code from my old course website to give it a try.
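As a rough illustration of what that looks like, here is a minimal _quarto.yml for a book project. It is a sketch following the Quarto book docs. The chapter file names are hypothetical, and this is not the book's actual configuration.

```yaml
# _quarto.yml (minimal book project, hypothetical chapter files)
project:
  type: book

book:
  title: "Biological Data Science with R"
  chapters:
    - index.qmd      # the book's landing page
    - dplyr.qmd      # hypothetical chapter file
    - ggplot2.qmd    # hypothetical chapter file
```

Chapters are rendered in the order listed, and running quarto render in the project directory builds the whole book.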

I intended to write this as a short tutorial, but there’s no tutorial here, because I really didn’t have to do anything! All the RMarkdown source from my old website just worked. There were a few slight customizations I made to the _quarto.yml from my old _site.yml. I updated a little bit of the verbiage to refer to book chapters instead of lessons, and I updated a few places to use cross-references instead of hard-coded hyperlinks. I followed the GitHub Pages publishing docs to configure my gh-pages branch to serve up the content I push to the main branch after running quarto publish gh-pages. And I stuck a custom subdomain on the repo so I could serve at bdsr.stephenturner.us instead of the default stephenturner.github.io/bdsr. The process took me about an hour or so. The book website is live at bdsr.stephenturner.us, and the publish command automatically makes both PDF and EPUB versions available.
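The PDF and EPUB versions come from output formats listed in _quarto.yml. A fragment along these lines is one way to get that behavior. It is a sketch based on the Quarto book options, not a copy of the book's actual configuration.

```yaml
# _quarto.yml fragment (assumed settings, for illustration only)
book:
  downloads: [pdf, epub]   # adds download links to the HTML version of the book

format:
  html: default
  pdf: default             # PDF output needs a LaTeX installation
  epub: default
```

With all three formats listed, a single render produces the website plus the downloadable files.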


The book is based on course material I developed around 2015 and taught through 2018, so it’s starting to show its age. There are some dplyr functions that are deprecated or superseded, it uses gather() and spread() from tidyr instead of pivot_*(), and it’s using caret and related packages instead of tidymodels. And, in the predictive modeling and forecasting chapter, there’s a section on forecasting influenza-like illness that shows perfectly regular seasonal ILI patterns through 2019-2021, a counterfactual that co-instructor and co-author VP (Pete) Nagraj and I published a paper on years later.

I recently wrote a short essay about learning in public:

In that spirit I wanted to share a few resources related to books and other Quarto topics I’ve been reading.

The Quarto Books documentation is a well-organized place to start to get more info on publishing a book with Quarto, and you can find other great examples in the gallery. I also wanted to call out two relatively new Quarto output document types that are worth looking at.

Mine Çetinkaya-Rundel, Professor of the Practice of Statistical Science at Duke University, gave a great talk about Quarto Manuscripts at the R/Medicine conference earlier this year. The talk was awesome. With Quarto Manuscripts you can write a narrative and include additional R/Python/etc. notebooks alongside the manuscript, and render the output in multiple formats. See the example here, and Mine’s talk below.

I’ve used and recommended flexdashboard for making static or Shiny-enabled dashboards using RMarkdown. When Quarto first launched, dashboards were missing for at least a year. Quarto 1.4 was released earlier this year, introducing Quarto Dashboards along with other new features. See the short video below from Posit.
