
Original link: https://news.ycombinator.com/item?id=43495811

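Xorq is an open-source, Python-first library that aims to ease the pain of moving pandas-style data pipelines from research to production. Built on Ibis and DataFusion, it targets the SQL/pandas impedance mismatch, hard-to-debug runtime failures, wasteful recomputation, and unreliable deployments. Key features include: an Ibis-based multi-engine expression system with effortless engine-to-engine streaming; expression caching; a portable DataFusion-backed UDF engine with first-class support for pandas DataFrames; YAML serialization of expressions; and easy creation of Flight endpoints by composing UDFs. The creators encourage collaboration (Apache 2.0 license) and welcome feedback. To install: `pip install xorq` or `nix run github:xorq-labs/xorq`. Demos include MCP Server + Flight + XGBoost, a DuckDB concurrency example, and an OpenAI UDF. The team is available to answer questions.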


Original text
Show HN: Xorq – open-source Python-first Pandas-style pipelines (github.com/xorq-labs)
8 points by secretasiandan 1 hour ago
Hi HN, Dan, Hussain and Daniel here… After years of struggling with data pipelines that worked in notebooks but failed in production, we decided to do something about it. We created xorq to eliminate the constant headaches of SQL/pandas impedance mismatch, runtime debugging, wasteful recomputations and unreliable research-to-production deployments that plague traditional pandas-style pipeline workflows. xorq is built on Ibis and DataFusion.

We’d love your feedback and contributions. xorq is [Apache 2.0 licensed](https://github.com/letsql/xorq/blob/main/LICENSE) to encourage open collaboration.

Repo: https://github.com/letsql/xorq

Docs: https://docs.xorq.dev

Roadmap Issues: https://github.com/letsql/xorq

You can get started with `pip install xorq`.

Or, if you use nix, you can simply run `nix run github:xorq-labs/xorq` and drop into an IPython shell.

Demo video: https://youtu.be/jUk8vrR6bCw

Here are some vignettes to look into next:

1. MCP Server + Flight + XGBoost: https://docs.xorq.dev/vignettes/mcp_flight_server

2. 1 DuckDB + 2 Writers + 1 Reader: https://docs.xorq.dev/vignettes/duckdb_concurrent

3. OpenAI UDF: https://docs.xorq.dev/tutorials/hn_data_prep

Some features to note:

- Ibis-based multi-engine expression system: effortless engine-to-engine streaming

- Cache expressions with the `.cache` operator

- Portable DataFusion-backed UDF engine with first-class support for pandas DataFrames

- Serialize expressions to and from YAML

- Easily build Flight endpoints by composing UDFs
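
To make the expression-caching idea concrete, here is a minimal, library-free sketch of how caching a deferred expression can avoid recomputation. It does not use xorq or Ibis and is not how xorq implements `.cache`; the names `Expr`, `OPS`, and `CachingExecutor` are hypothetical and exist only for this illustration. The assumption is that an expression can be given a stable content hash, which then serves as the cache key.

```python
import hashlib
import json

# Hypothetical toy operations; a real engine would run queries instead.
OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


class Expr:
    """A tiny deferred expression: an operation name plus its inputs."""

    def __init__(self, op, *args):
        self.op = op
        self.args = args

    def to_obj(self):
        # Convert the expression tree into plain, JSON-serializable data.
        return {
            "op": self.op,
            "args": [a.to_obj() if isinstance(a, Expr) else a for a in self.args],
        }

    def key(self):
        # Stable content hash: structurally equal expressions share a key.
        payload = json.dumps(self.to_obj(), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachingExecutor:
    """Evaluates expressions, storing each result under the expression's key."""

    def __init__(self):
        self.store = {}
        self.computed = 0  # number of operations actually executed

    def execute(self, expr):
        if not isinstance(expr, Expr):
            return expr  # plain literal, nothing to compute
        k = expr.key()
        if k in self.store:
            return self.store[k]  # cache hit: skip recomputation
        args = [self.execute(a) for a in expr.args]
        self.computed += 1
        result = OPS[expr.op](*args)
        self.store[k] = result
        return result


shared = Expr("add", 1, 2)
total = Expr("mul", shared, shared)  # the same sub-expression is used twice

executor = CachingExecutor()
first = executor.execute(total)
second = executor.execute(total)  # served entirely from the cache
```

In this sketch the shared sub-expression is evaluated once even though it appears twice, and re-running the whole expression performs no new work. A production system would also need to persist the store and invalidate entries when the source data changes, which this toy version ignores.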

Thanks for checking this out, and we’re here to answer any questions!










