Show HN: Unofunction – an AWS Lambda that uses LibreOffice to convert documents

1 point by javidlakha, about 3 years ago

1 comment

javidlakha, about 3 years ago
*Motivation*

I wanted to learn about serverless computing and so I built Unofunction, a Python Lambda that wraps LibreOffice’s headless mode and can convert any document format that LibreOffice can import to any document format that LibreOffice can export (e.g. DOCX to PDF).

*Implementation*

Calling LibreOffice in a Python Lambda required compiling LibreOffice for the Amazon Linux 2 base image (CentOS 7). I originally tried compiling LibreOffice locally (using Docker) on my 2021 M1 MacBook Pro (10 CPU cores) but gave up after a few hours. On an AWS EC2 c6i.8xlarge instance (32 vCPUs, $1.36 per hour in eu-west-2), compiling took ~30 minutes. In case someone else might need a prebuilt headless LibreOffice that has been compiled for Amazon Linux 2, I’ve uploaded the resulting image to https://hub.docker.com/repository/docker/unofunction/libreoffice

A crucial implementation detail is that because AWS Lambda does not permit files to be written, except to the /tmp directory, LibreOffice needs to be called with the argument '-env:UserInstallation=file:///tmp/'. Otherwise, it will attempt user installation in .config/libreoffice and fail.

*Thoughts*

I enjoyed using the AWS CDK a lot. There is something magical about being able to deploy infrastructure by manipulating it as if it were a Python or a TypeScript object. I preferred it to Terraform’s declarative approach.[1] My only criticisms are that deployment can be extremely slow and that destroying S3 buckets can be clunky (they need to be emptied first).

Testing code deployed to AWS is hard. Neither AWS SAM nor LocalStack perfectly simulates a real deployment: code which works locally does not always work on Lambda. (An example of this is deploying files to the /tmp directory. SAM allows such files to be edited or overwritten; Lambda makes them read-only.)

[1] This article is a good comparison between Terraform and the AWS CDK: https://medium.com/swlh/cdk-or-terraform-88a464bedf9e
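
For illustration, the LibreOffice call described under *Implementation* might look roughly like the sketch below. This is not Unofunction's actual code. The binary name `soffice`, the helper names `build_command` and `convert`, and the 60-second timeout are all assumptions; only the `-env:UserInstallation=file:///tmp/` argument comes from the post.

```python
import subprocess
from pathlib import Path

# Assumption: the LibreOffice binary is reachable on PATH as "soffice".
# In a custom Lambda image it may live at a different path.
SOFFICE = "soffice"


def build_command(input_path, output_format, out_dir="/tmp", profile_dir="/tmp"):
    """Return the argv list for a headless LibreOffice conversion (hypothetical helper)."""
    return [
        SOFFICE,
        # Keep the user profile under /tmp, the only writable location on Lambda.
        f"-env:UserInstallation=file://{profile_dir}/",
        "--headless",
        "--convert-to", output_format,
        "--outdir", out_dir,
        str(input_path),
    ]


def convert(input_path, output_format, out_dir="/tmp", timeout=60):
    """Run the conversion and return the path where the output is expected.

    Assumes output_format is a plain extension such as "pdf"; a value with an
    explicit filter name would need different handling of the output path.
    """
    subprocess.run(
        build_command(input_path, output_format, out_dir),
        check=True,       # raise CalledProcessError if LibreOffice exits non-zero
        timeout=timeout,  # avoid hanging until the Lambda itself times out
    )
    return Path(out_dir) / f"{Path(input_path).stem}.{output_format}"
```

A handler would first download the source document into /tmp, call something like `convert("/tmp/report.docx", "pdf")`, and then upload the resulting file. Building the argument list separately from running it keeps the command easy to inspect without LibreOffice installed.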