Lesser-known Pandas tricks (2019)

159 points by HIP_HOP, about 5 years ago

7 comments

jpxw, about 5 years ago
Something I love about pandas is that often you can pass a URL in place of a file name.

The other day I needed to scrape data from a table on a webpage. Thinking about traversing the DOM and building up an array was already giving me a headache. Thankfully pandas has the “read_html” function. Getting a list of dataframes for each table on the page was as easy as:

    dfs = pd.read_html(url)

aksakalli, about 5 years ago
Medium wants me to upgrade my account to read this article. Please, people, share your posts somewhere else.

andreareina, about 5 years ago
Merge with indicator is also useful for doing anti-joins:

    left.merge(right, how="left", indicator=True, ...)[lambda df: df._merge == "left_only"]
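
A minimal sketch of the anti-join above, assuming two small hypothetical frames (`left` and `right`) that share a `key` column; the column names and values are made up for illustration. `indicator=True` adds a `_merge` column, and keeping only the `left_only` rows leaves the rows of `left` that have no match in `right`.

```python
import pandas as pd

# Hypothetical example data; only the "key" column is shared.
left = pd.DataFrame({"key": [1, 2, 3, 4], "left_val": ["a", "b", "c", "d"]})
right = pd.DataFrame({"key": [2, 4, 5], "right_val": ["x", "y", "z"]})

# Left join with an indicator column named "_merge".
merged = left.merge(right, on="key", how="left", indicator=True)

# Anti-join: keep rows that exist only in `left`, then drop helper columns.
anti = (
    merged[lambda df: df["_merge"] == "left_only"]
    .drop(columns=["_merge", "right_val"])
)
```

With this data, `anti` should contain only the rows of `left` whose keys are 1 and 3.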

staticautomatic, about 5 years ago
My favorite, most elegant SO answer I've ever gotten was to a question about Pandas.

The question was "How do I create a column where each row's value is the mean of another column's values starting at that row?" The answer was:

    df.loc[::-1, 'col_1'].expanding().mean()[::-1]
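
To see why this works, here is a small sketch on a hypothetical four-row frame (the `mean_from_here` column name and the values are assumptions for illustration). Reversing the column turns "mean from this row to the end" into an ordinary expanding mean, and the result is then flipped back. A brute-force list is computed alongside for comparison.

```python
import pandas as pd

# Hypothetical data with a default RangeIndex.
df = pd.DataFrame({"col_1": [2.0, 4.0, 6.0, 8.0]})

# Expanding mean over the reversed column: each value is the mean of
# everything from the last row up to the current row.
reversed_means = df.loc[::-1, "col_1"].expanding().mean()

# Flip back to the original order; assignment also aligns on the index.
df["mean_from_here"] = reversed_means.iloc[::-1]

# Brute-force check: mean of col_1 from row i to the end.
brute_force = [df["col_1"].iloc[i:].mean() for i in range(len(df))]
```

Both `df["mean_from_here"]` and `brute_force` should agree row by row.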

closed, about 5 years ago
Note that there is a handy PeriodIndex version of pd.date_range:

    pd.period_range(date_from, date_to, freq="D")

AFAICT, a PeriodIndex and a DatetimeIndex function mostly the same, and have many of the same methods, except...

    * DatetimeIndex can't hold dates far in the future
    * PeriodIndex can't easily round to the end of a period (e.g. date + 0*MonthEnd() errors)
    * PeriodIndex doesn't handle timezones?
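
A rough sketch of the comparison above, using made-up dates. It builds the same daily range both ways, creates a daily PeriodIndex for the year 2500, and shows one possible workaround for month-end rounding (converting to monthly periods and back to the last day). The `asfreq` round trip is an assumption about what the commenter wanted, not something from the article.

```python
import pandas as pd

# The same four days as a DatetimeIndex and as a PeriodIndex.
days_dt = pd.date_range("2020-01-30", "2020-02-02", freq="D")
days_p = pd.period_range("2020-01-30", "2020-02-02", freq="D")

# Many accessors (here .month) exist on both index types.
same_months = days_dt.month.tolist() == days_p.month.tolist()

# A daily PeriodIndex far in the future.
far_future = pd.period_range("2500-01-01", "2500-01-03", freq="D")

# Possible month-end workaround: day -> month period -> last day of that month.
month_end = days_p.asfreq("M").asfreq("D", how="end")
month_end_labels = [str(p) for p in month_end]
```

For the January and February 2020 dates used here, `month_end_labels` should come out as the last day of each month.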

HIP_HOP, about 5 years ago
TL;DR:

5 lesser-known pandas tricks:

1. Date ranges
2. Merge with indicator
3. Nearest merge by timestamp
4. Create an Excel report from pandas
5. Use gzip when saving to CSV
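
For items 3 and 5, a short sketch under stated assumptions: the `trades` and `quotes` frames, their columns, and the file name are all hypothetical. `pd.merge_asof` with `direction="nearest"` matches each trade to the closest quote by timestamp, and `to_csv` with `compression="gzip"` writes a compressed file that `read_csv` can read back.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data; both frames must be sorted by the merge key.
trades = pd.DataFrame({
    "time": pd.to_datetime(["2020-01-01 09:00:01", "2020-01-01 09:00:05"]),
    "qty": [10, 20],
})
quotes = pd.DataFrame({
    "time": pd.to_datetime([
        "2020-01-01 09:00:00",
        "2020-01-01 09:00:04",
        "2020-01-01 09:00:09",
    ]),
    "price": [100.0, 101.0, 102.0],
})

# Item 3: match each trade to the quote with the nearest timestamp.
nearest = pd.merge_asof(trades, quotes, on="time", direction="nearest")

# Item 5: write a gzip-compressed CSV and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "nearest.csv.gz")
    nearest.to_csv(path, index=False, compression="gzip")
    restored = pd.read_csv(path, parse_dates=["time"])
```

The first trade is one second after the 09:00:00 quote and the second is one second after the 09:00:04 quote, so those are the quotes each one picks up.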

collyw, about 5 years ago
Does anyone want to do a TL;DR? I don't especially want to sign into Medium.