Background
Before we start, let me give you some context here.
What is Data Service
Data Service is a Python middle layer between our application services and the data layer. It provides controlled and efficient MySQL access and third-party API calls for our critical Authentication and Payment services. Placing Data Service in the middle helps to:
- Protect the MySQL database from a large number of connections during high QPS and heavy client load.
- Reduce the risk from abnormal client behaviour by applying a rate limit to each API.
- Separate third-party API calls from DB access, to minimize the impact when any third-party API is unhealthy.
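The per-API rate limit in the list above can be pictured as one token bucket per API name. The following is a minimal Go sketch under that assumption, not the Data Service's actual limiter: `APILimiter`, its `rate` and `burst` parameters, and the numbers used in `main` are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket holds the token count for a single API name.
type bucket struct {
	tokens float64
	last   time.Time
}

// APILimiter keeps one token bucket per API name.
// Hypothetical type for illustration only.
type APILimiter struct {
	mu      sync.Mutex
	rate    float64 // tokens added per second
	burst   float64 // bucket capacity
	buckets map[string]*bucket
}

func NewAPILimiter(rate, burst float64) *APILimiter {
	return &APILimiter{rate: rate, burst: burst, buckets: make(map[string]*bucket)}
}

// Allow reports whether one more call to api may proceed right now.
func (l *APILimiter) Allow(api string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	now := time.Now()
	b, ok := l.buckets[api]
	if !ok {
		// A new API starts with a full bucket.
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[api] = b
	}
	// Refill according to elapsed time, capped at the burst size.
	b.tokens += now.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = now

	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	limiter := NewAPILimiter(1, 2) // assumed limits: 1 request per second, burst of 2
	for i := 0; i < 3; i++ {
		fmt.Println(limiter.Allow("app_point.execute_payment"))
	}
}
```

Because each API has its own bucket, a client hammering one API exhausts only that API's tokens and leaves the others unaffected.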
Why a new Data Service
- Python is an interpreted language, and its speed is a recurring problem when compared with many other programming languages.
- Python’s memory consumption is very high, possibly because of its flexible data types. Memory usage also keeps growing over time, which is a headache for our DevOps team.
- As engineers, we should keep exploring new technologies and improving our services instead of waiting until it is too late.
Why Golang
I do not want to discuss the differences between Python and Golang here, or whether Golang will replace Python. Golang is currently one of the fastest-growing programming languages in the software industry, and we chose it for the following reasons:
- Golang offers a native concurrency model, which makes it easy to run many tasks at once. This suits the usage scenario of Data Service well.
- Golang’s syntax is small compared to other languages, which makes it easy to learn.
- Golang compiles to native code and has a lightweight concurrency model, which makes it fast. It also links all dependency libraries into a single binary file, so deployment does not depend on anything installed on the servers.
- Golang ships with a fairly mature standard library. Once you install Go, you can build production-level software covering a wide range of use cases before needing to consider any third-party packages.
- The Golang community is very active, so you rarely need to reinvent the wheel.
Design
It is always difficult to start from scratch, especially when you know nothing about the technology. Suddenly you find there are too many things to think through carefully before taking any action. After some struggle, I am on my way. Looking back, I want to note down some of the big choices I made and explain the reasons.
Data protocol
Since we are writing a replacement for the existing Python Data Service, we will keep the current JSON protocol to minimize changes. That way we do not need to update any code in our fastcgi applications such as sso_website or app_point. gRPC is awesome, but not our choice this time.
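To make the JSON protocol concrete, here is a rough Go sketch of a request and reply envelope. The field names (`id`, `api`, `app_id`, `request`, `error`, `result`) are borrowed from the log example later in this post; the exact wire format is an assumption, and `Request`, `Reply`, `decodeRequest` and `encodeReply` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request is a hypothetical envelope for one API call.
type Request struct {
	ID     string          `json:"id"`
	API    string          `json:"api"`
	AppID  int64           `json:"app_id"`
	Params json.RawMessage `json:"request"` // left raw so each API can decode its own parameters
}

// Reply is a hypothetical envelope for the response.
type Reply struct {
	ID     string      `json:"id"`
	Error  string      `json:"error"`
	Result interface{} `json:"result"`
}

// decodeRequest parses one JSON-encoded request.
func decodeRequest(data []byte) (Request, error) {
	var req Request
	err := json.Unmarshal(data, &req)
	return req, err
}

// encodeReply serializes a reply to its JSON form.
func encodeReply(r Reply) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	raw := []byte(`{"id":"12","api":"app_point.execute_payment","app_id":10000,"request":{"uid":200001}}`)
	req, err := decodeRequest(raw)
	if err != nil {
		fmt.Println("bad request:", err)
		return
	}
	out, err := encodeReply(Reply{ID: req.ID, Result: map[string]string{"point_balance": "1045"}})
	if err != nil {
		fmt.Println("bad reply:", err)
		return
	}
	fmt.Println(req.API, out)
}
```

Keeping the parameters as `json.RawMessage` means the envelope can be routed by API name first, and only the matching handler needs to know the shape of its own parameters.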
Golang conventions
When you come to a new programming language, it is important to follow its own conventions and best practices, even if some of them do not fit our team style.
- Package names should be short and avoid underscores or dashes. That’s why you see gocommon and dataserver instead of go_common and data_server.
- Camel case is preferred in Golang.
- JSON-formatted logs are common in Golang services, in place of plain text. Here is an example.
{"level":"info","ts":"2020-02-19T14:33:55.372+0800","caller":"router/router.go:122","msg":"handle_packet_success","local":"127.0.0.1:21002","remote":"127.0.0.1:59834","id":"12","api":"app_point.execute_payment","app_id":10000,"timestamp":1582094035,"request":{"app_id":10001,"platform":1,"app_role_id":2,"uid":200001,"app_txn_id":"12345678.432432","txn_status":2,"point_amount":1,"item_id":1,"item_quantity":1,"ip":12312312,"action_country":"SG","memo":"","ext_data":"{\"app_remark\": \"this is just for test\"}"},"reply":{"id":"12","error":"","result":{"txn_id":"6054651918534735383","point_balance":"1045"}},"latency":96.924803}
Indeed, JSON logs are much easier for a log collector to parse.
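As a small illustration of why, here is a sketch of how a collector written in Go might pull a few fields out of such a line with the standard `encoding/json` package. The `logEntry` struct and `parseLogLine` function are hypothetical, and the sample line in `main` is a shortened version of the log above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logEntry names only the fields we care about; all other keys are ignored.
type logEntry struct {
	Level   string  `json:"level"`
	Msg     string  `json:"msg"`
	API     string  `json:"api"`
	Latency float64 `json:"latency"`
}

// parseLogLine decodes a single JSON log line.
func parseLogLine(line string) (logEntry, error) {
	var e logEntry
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

func main() {
	line := `{"level":"info","msg":"handle_packet_success","api":"app_point.execute_payment","latency":96.924803}`
	e, err := parseLogLine(line)
	if err != nil {
		fmt.Println("bad log line:", err)
		return
	}
	fmt.Println(e.Level, e.Msg, e.API, e.Latency)
}
```

There are no regular expressions or column positions to maintain: adding a new key to the log does not break an existing parser.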
Synchronous or asynchronous
At the very beginning, I intended to design the TCP server with async workers to handle requests, which seemed cool. Every request would be put in a task queue, and a goroutine would fetch from the queue, dispatch a new goroutine to handle the task, and return the data to the originating connection. When I finished the code and ran a load test, the result was not as good as expected. Why? What happened? Here are some takeaways:
- Async workers do not help in this scenario, because even if the server handles tasks asynchronously, the client still needs to wait for the response.
- The coolest feature of Golang is that it lets developers write asynchronous programs in a synchronous style and leaves the asynchronous mechanics to the Go scheduler, which parks goroutines that block and resumes them when they can continue. As a Golang user, you can focus on what you are good at: writing correct logic.
In the end, I changed the server to sync workers: for every request from a client, the server spawns a new goroutine to handle the task and return the data to the client.
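The sync-worker design can be sketched in a few lines of Go. This is a simplified illustration, not the real server: it starts one goroutine per connection and handles that connection's requests in order, it assumes newline-delimited requests, and `handle`, `serveConn`, `serve`, `startServer` and `roundTrip` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// handle stands in for the real business logic (DB access, third-party calls).
func handle(request string) string {
	return "reply:" + strings.TrimSpace(request)
}

// serveConn runs in its own goroutine and is written in plain blocking style.
func serveConn(conn net.Conn) {
	defer conn.Close()
	scanner := bufio.NewScanner(conn)
	for scanner.Scan() { // assumed framing: one request per line
		fmt.Fprintln(conn, handle(scanner.Text()))
	}
}

// serve accepts connections until the listener is closed.
func serve(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		go serveConn(conn) // no task queue: the Go scheduler does the multiplexing
	}
}

// startServer listens on a free local port and returns its address and a stop function.
func startServer() (addr string, stop func(), err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	go serve(ln)
	return ln.Addr().String(), func() { ln.Close() }, nil
}

// roundTrip is a tiny client that sends one request and waits for one reply.
func roundTrip(addr, request string) (string, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprintln(conn, request); err != nil {
		return "", err
	}
	line, err := bufio.NewReader(conn).ReadString('\n')
	return strings.TrimSpace(line), err
}

func main() {
	addr, stop, err := startServer()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer stop()
	reply, err := roundTrip(addr, "ping")
	fmt.Println(reply, err)
}
```

Each `serveConn` reads, handles and writes as if it were the only thing running. When it blocks on the network, the scheduler simply runs another goroutine, which is the "async written in a sync way" effect described above.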
ORM or not?
To be honest, an ORM makes life easier. With a powerful ORM such as the one in Django, you rarely need to worry about writing queries by hand. But of course, junior developers may make mistakes when using an ORM and end up writing inefficient code.
In Golang, I explored several ORMs and failed to find one that met my requirements: easy to use, fast, and supporting sharded tables. As a result, I wrote a very simple wrapper around Golang’s MySQL library and came up with my own ORM. You can check out gocommon if you are interested.
Let me explain some benefits of using raw SQL:
- It is fast. There is no data type reflection, which accounts for much of the overhead in ORMs.
- You know exactly what you are writing when implementing the logic. There are no hidden tricks in the SQL command.
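To illustrate the raw SQL approach with sharded tables, here is a minimal sketch on top of the standard `database/sql` package. It is not the gocommon wrapper: the table name `point_balance_tab`, the `balance` and `uid` columns, the shard count of 100 and the helper names are all assumptions made up for this example. Opening a real connection would also need a MySQL driver, which is left out here.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

const shardCount = 100 // hypothetical number of shard tables

// shardTable maps a uid to the name of the table that stores it.
func shardTable(base string, uid uint64) string {
	return fmt.Sprintf("%s_%02d", base, uid%shardCount)
}

// queryBalance runs a hand-written query and scans the result into a typed
// variable, with no ORM mapping layer in between.
func queryBalance(ctx context.Context, db *sql.DB, uid uint64) (int64, error) {
	// The table name comes from trusted code only; values always go through placeholders.
	query := "SELECT balance FROM " + shardTable("point_balance_tab", uid) + " WHERE uid = ?"
	var balance int64
	err := db.QueryRowContext(ctx, query, uid).Scan(&balance)
	return balance, err
}

func main() {
	// Only the shard routing is exercised here; queryBalance needs a live database.
	fmt.Println(shardTable("point_balance_tab", 200001))
}
```

The SQL that reaches the database is exactly the string you see in the code, and choosing the shard is an ordinary function you can unit test.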
That said, GORM and SQLBoiler are popular ORMs in Golang, and they definitely deserve more exploration in the future.
Monitoring
As I mentioned earlier, the Golang service logs in JSON format, which means we can’t reuse our existing metrics collector. It is time to try out Prometheus monitoring, which is simple to set up when you are writing a new service. I also created a common library in gocommon that you can use anytime you want.
Performance
After the first round of load testing, here are the results under 3000 RPS for the app_point service:
Metric | Python Data Service | Golang Data Service | Golang Data Service with Prometheus |
---|---|---|---|
Memory | around 8 GB | around 200 MB | around 650 MB |
CPU | around 17 | around 6 | around 6.5 |
What’s next
We will keep improving Golang Data Service and gradually migrate services from Python Data Service to it.
Stay tuned for more posts!