Original article: [The boring technology behind a one-person Internet company](https://www.listennotes.com/blog/the-boring-technology-behind-a-one-person-23/)

##### Overview

Let’s start with the requirements or features of this Listen Notes project.

Listen Notes provides two things to end users:

- A website, [ListenNotes.com](https://www.listennotes.com/), for podcast listeners. It provides a search engine, a podcast database, [Listen Later](https://www.listennotes.com/listen/?s=nav) playlists, [Listen Clips](https://www.listennotes.com/clips/?s=nav) that let you cut a segment out of any podcast episode, and [Listen Alerts](https://www.listennotes.com/alerts) that notify you when a specified keyword is mentioned in new podcasts on the Internet.
- [Podcast Search & Directory APIs](https://www.listennotes.com/api/) for developers. We need to track the API usage, get money from paid users, do customer support, and more.

I run everything on AWS. There are 20 production servers (as of May 5, 2019). You can easily guess what each server does from its hostname:

- **production-web** serves web traffic for [ListenNotes.com](https://www.listennotes.com/).
- **production-api** serves API traffic. We run two versions of the API (as of May 4, 2019), hence v1api (the legacy version) and v2api (the new version).
- **production-db** runs PostgreSQL (master & slave).
- **production-es** runs an Elasticsearch cluster.
- **production-worker** runs offline processing tasks to keep the podcast database always up-to-date and to provide some magical things (e.g., search result ranking, episode/podcast recommendations…).
- **production-lb** is the load balancer. I also run Redis & RabbitMQ on this server, for convenience. I know this is not ideal.
But I’m not a perfect person :)
- **production-pangu** is the production-like server where I sometimes run one-off scripts and test changes. What’s the meaning of “[pangu](https://en.wikipedia.org/wiki/Pangu)”?

##### Backend

The entire backend is written in Django / Python3. The operating system of choice is Ubuntu.

I use [uWSGI](https://uwsgi-docs.readthedocs.io/en/latest/) to serve web traffic. I put [NGINX](https://www.nginx.com/) in front of the uWSGI processes, and it also serves as the load balancer.

The main data store is [PostgreSQL](https://www.postgresql.org/), with which I’ve had a lot of development & operational experience over many years. Battle-tested technology is good, so I can sleep well at night. [Redis](https://redis.io/) is used for various purposes (e.g., caching, stats). It’s not hard to guess that [Elasticsearch](https://www.elastic.co/) is used somewhere. Yes, I use Elasticsearch to index podcasts & episodes and to serve search queries, just like [most](https://medium.com/netflix-techblog/tagged/elasticsearch) [boring](https://engineeringblog.yelp.com/2017/06/moving-yelps-core-business-search-to-elasticsearch.html) [companies](https://eng.uber.com/tag/elasticsearch/).

[Celery](http://www.celeryproject.org/) is used for offline processing.
And [Celery Beat](http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html) is for scheduling tasks, which is like Cron jobs but a bit nicer. If in the future Listen Notes gains traction and Celery & Beat cause some scaling issues, I will probably switch to the two projects I did for my previous employer: [ndkale](https://github.com/Nextdoor/ndkale) and [ndscheduler](https://github.com/Nextdoor/ndscheduler).

[Supervisord](http://supervisord.org/) is used for process management on every server.

Wait, how about Docker / Kubernetes / serverless? Nope. As you gain experience, you know when not to over-engineer. I actually did some early Docker work for my previous employer back in 2014, which was good for a mid-sized billion-dollar startup but may be overkill for a one-person tiny startup.

##### DevOps

##### Machine provisioning & code deployment

I use [Ansible](http://docs.ansible.com/) for machine provisioning. Basically, I wrote a bunch of yaml files to specify which types of servers need which configuration files & which software. I can spin up a server with all the correct configuration files & all software installed with one button push. This is the directory structure of those Ansible yaml files:

I also use Ansible to deploy code to production. Basically, I have a wrapper script *deploy.sh* that is run on macOS:

*./deploy.sh production HEAD web*

The deploy.sh script takes three arguments:

- **Environment**: production or staging.
- **Version of the listennotes repo**: HEAD means “just deploy the latest version”. If the SHA of a git commit is specified, then it’ll deploy that specific version of the code. This is particularly useful when I need to roll back from a bad deployment.
- **What kind of servers**: web, worker, api, or all. I don’t have to deploy to all servers at once. Sometimes I only change JavaScript code, and then I just need to deploy to web, without touching api or worker.

The deployment process is mostly orchestrated by Ansible yaml files, and of course, it’s dead simple:

- **On my MacBook Pro**, if the deployment targets web servers, build the JavaScript bundles and upload them to S3.
- **On the target servers**, git clone the listennotes repo to a timestamp-named folder, check out the specific version, and pip install new Python dependencies, if any.
- **On the target servers**, switch the symlink to the above timestamp-named folder and restart servers via supervisorctl.

As you can see, I don’t use those fancy CI tools. Just dead simple things that actually work.

##### Monitoring & alerting

I use [Datadog](https://www.datadoghq.com/) for monitoring & alerting. I’ve got some high-level metrics in a simple dashboard. Whatever I do here is to boost my confidence when I am messing around with the production servers.

I connect [Datadog](https://www.datadoghq.com/) to PagerDuty. If something goes wrong, [PagerDuty](https://www.pagerduty.com/) will send me alerts via phone call & SMS.
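
As an aside on the deployment list earlier in this section: the on-server steps (clone into a timestamp-named folder, check out a version, install dependencies, switch a symlink, restart) can be sketched in Python. This is only an illustration, not the author’s Ansible playbook. The function names, the `requirements.txt` file, and the `current` symlink name are all assumptions; the post does not show the real files.

```python
import os
from pathlib import Path
from typing import List


def deploy_commands(repo_url: str, release: Path, version: str) -> List[List[str]]:
    """Build (but do not run) the commands for the on-server steps."""
    return [
        ["git", "clone", repo_url, str(release)],
        ["git", "-C", str(release), "checkout", version],
        # Assumption: Python dependencies are listed in requirements.txt.
        ["pip", "install", "-r", str(release / "requirements.txt")],
    ]


def switch_symlink(release: Path, current: Path) -> None:
    """Point the `current` symlink at `release`.

    A temporary link is created first and then renamed over the old one,
    so `current` never disappears while processes are reading from it.
    """
    tmp = current.with_name(current.name + ".tmp")
    if tmp.is_symlink():
        tmp.unlink()  # clean up after an interrupted earlier run
    tmp.symlink_to(release, target_is_directory=True)
    os.replace(tmp, current)  # renames the link itself, not its target


# Final step from the post: restart the processes managed by Supervisor.
RESTART_COMMAND = ["supervisorctl", "restart", "all"]
```

Because every release lives in its own folder, the previous folder is still on disk after a switch, which fits the post’s point that deploying a specific SHA is handy for rolling back.
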

I also use [Rollbar](https://rollbar.com/) to keep an eye on the health of the Django code. It catches unexpected exceptions and notifies me via email & Slack as well.

Since launching in early 2017, Listen Notes hasn’t had any big outage (> 5 minutes) except for [this one](https://broadcast.listennotes.com/postmortem-on-apr-22-2018-outage-e5a87723d003). I’m always very careful & practical about this operational stuff. The web servers are significantly over-provisioned, just in case there’s some huge spike due to press events or whatever.

##### Development

I work in a WeWork coworking space in San Francisco. Some people may wonder why I don’t just work from home or from some random coffee shop. Well, I value productivity a lot and I’m willing to invest money in productivity. I don’t believe piling on hours helps software development (or any sort of knowledge/creative work). It’s rare that I work over 8 hours in a day (sorry, 996 people). I want to make every minute count. Thus, a nice & relatively expensive private office is what I need :) Instead of optimizing for spending more time & saving money, I optimize for spending less time & making money :)

I’m using a MacBook Pro. I run the (almost) identical infrastructure inside Vagrant + VirtualBox. I use the same set of Ansible yaml files as described above to provision the development environment inside Vagrant.

I subscribe to the monolithic repo philosophy.
So there’s one and only one listennotes repo, containing DevOps scripts and frontend & backend code. This listennotes repo is hosted as a GitHub private repo. I do all development work on the master branch. I rarely use feature branches.

I write code and run the dev servers (Django runserver & webpack dev server) using PyCharm. Yeah, I know, it’s boring. After all, it’s not Visual Studio Code or Atom or whatever the cool IDEs are. But PyCharm works just fine for me. I’m old school.

##### Miscellaneous

There are a bunch of useful tools & services that I use to build Listen Notes as a product and a company:

- [iTerm2](https://www.iterm2.com/) and [tmux](https://github.com/tmux/tmux/wiki) for the terminal stuff.
- [Notion](https://www.notion.so/) for TODO lists, wiki, taking notes, design documents…
- [G Suite](https://gsuite.google.com/) for the @listennotes.com email account, calendar, and other Google services.
- [MailChimp](http://www.mailchimp.com/monkey-rewards/?utm_source=freemium_newsletter&utm_medium=email&utm_campaign=monkey_rewards&aid=da29e56f1e479faf6b4ef3f72&afl=1) for sending the [monthly email newsletter](https://us16.campaign-archive.com/home/?u=da29e56f1e479faf6b4ef3f72&id=ba72067923).
- [Amazon SES](https://aws.amazon.com/ses/) for sending transactional & some marketing emails.
- [Gusto](https://gusto.com/r/wenbin) to pay myself and contractors who are not from Upwork.
- [Upwork](https://www.upwork.com/) to find contractors.
- [Google Ad Manager](https://admanager.google.com/home/) to manage direct sales ads and track performance.
- [Carbon Ads](https://www.carbonads.net/) and [BuySellAds](https://www.buysellads.com/) for fallback ads.
- [Cloudflare](https://www.cloudflare.com/) for DNS management, CDN, and firewall.
- [Zapier](https://zapier.com/) and [Trello](https://trello.com/) to streamline the [podcaster interview](https://www.listennotes.com/interviews/) workflow.
- [Medium](https://broadcast.listennotes.com/) for the company blog (obviously).
- [GoDaddy](https://www.godaddy.com/) and [Namecheap](https://www.namecheap.com/) for domain names.
- [Stripe](https://stripe.com/) for getting money from users (primarily for the [API](https://www.listennotes.com/api/)).
- [Google speech-to-text API](https://cloud.google.com/speech-to-text/) to transcribe episodes.
- [Kaiser Permanente](https://healthy.kaiserpermanente.org/) for health insurance.
- [Stripe Atlas](https://atlas.stripe.com/) to incorporate Listen Notes, Inc.
- [Clerky](https://www.clerky.com/) to generate legal documents for fundraising (SAFE) and hiring contractors who are not from Upwork.
- [QuickBooks](https://www.referquickbooks.com/s/Wenbin) for bookkeeping.
- [1Password](https://1password.com/) to manage login credentials for tons of services.
- [Brex](http://brex.com/signup?rc=oPLQ0ZQ) for a charge card. You can get an incremental $5000 in AWS credits, which can be applied on top of the AWS credits from WeWork or Stripe Atlas.
- [Bonvoy Business Amex Card](http://refer.amex.us/WENBIFIUoH?XLINK=MYCP): you can earn Marriott Bonvoy points for luxury hotels and flights. They’re the best credit card points for traveling :)
- [Capital One Spark](https://www.capitalone.com/small-business-bank/) for the checking account.

##### Keep calm and carry on…

As you can see, we are living in a wonderful age to start a company. There are so many off-the-shelf tools and services that save us time & money and increase our productivity. It’s more possible than ever to build something useful to the world with a tiny team (or just one person), using simple & boring technology.

Most of the time, the biggest obstacle to building & shipping things is overthinking. What if this, what if that. Boy, you are not important at all. Everyone is busy with their own life. No one cares about you and the things you build until you prove that you are worth other people’s attention. Even if you screw up the initial product launch, few people will notice. Think big, start small, act fast. It’s absolutely okay to use boring technology and start with something simple (even ugly), as long as you actually solve problems.