Background
It was 2019, and by then I had become familiar with all the bits and pieces of how things work in our wholesale store. By the way, we sell women's fashion.
I tried my best to improve the overall efficiency, making it easier for the clerks to handle different tasks.
One thing caught my eye that needed to be dealt with urgently; otherwise too much time would keep being wasted.
The Problem
We are a wholesaler serving the whole country. There are lots of customer demands that we must cater to, one of which is to hide sensitive information on the product detail sheet that we send to customers along with their products.
These customers have managers running their stores. They try their best to hide information such as where they got the products (to prevent the managers from sourcing the same products if they leave and start their own business) and what the products cost (to hide the margin).
In the old days, the clerks in our store printed the detail sheet, grabbed the scissors, and cut the sensitive information out.
To give a rough idea, the following picture shows what a detail sheet looks like.
The red-crossed parts contain our store name, the address of our store, the contact information of our personnel, or the prices of the products.
So these customers need all of that information removed. Cutting one detail sheet is a viable option, but imagine there are dozens of them when a customer orders hundreds of products. How much time would that cost?
A Little Analysis Before Getting Down to Code
What's needed to print this sheet?
- The iPad app, to send the printing data.
- The desktop printing server, which listens to receive the data sent by the iPad app and sends it to the printer for printing.
- The iPad and the desktop computer, both on the same local network.
Break It Down
It's a rather standard client-server (CS) model, so I can simply intercept the data transmitted from the iPad to the computer. There's a setting in the app where you can set the IP address of the printing computer.
I tried to intercept the request, but it is a bad request: Python's own server.py simply outputs the error message bad request and shows nothing about the data being sent.
OK, that's the first glitch I have to solve. Since Python is open source, I can just tamper with the source code. Let me worry about that later.
Get the Hands Dirty
I didn't think that much. The first thought was to make a minimal viable version: I share the sheet document (PDF) from the app and save it to the computer, then process it with some Python PDF module, reformat the sheet, write it to an Excel document, and print it.
Log 11-18-2020
I started the project on Nov 18, 2020. Have to get the user interface ready to work.
Python module to use: Tkinter.
Log 11-19-2020
Had to find a Python module to deal with the PDF recognition, turn the PDF into CSV and feed it to pandas to process.
At the end of the day, I got the UI ready. I can select a file, point to the destination path, and the program converts the PDF to CSV and feeds it to pandas.
I manipulated the data to get the columns and rows that I want, and dropped the others.
Log 11-21-2020
I got an issue. The tabula module relies on the Java runtime. If I ever wanted to distribute the program for others to use, I cannot assume that they will have Java installed. As a matter of fact, none of them will, because Java is not pre-installed on Windows computers.
I had to find a way to include a Java runtime in my program and point the environment variables of my Python process to that Java runtime. Thus, no external reliance on Java.
I used the JDK and JRE from Red Hat. That ran perfectly well. Now I have a standalone program which can handle detail PDFs one at a time.
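Pointing the Python process at a bundled runtime could look roughly like the sketch below. It assumes the PDF library finds the java executable through PATH, and the "jre" folder name and the use_bundled_java helper are made up for illustration, not taken from the actual program.

```python
import os
from pathlib import Path


def use_bundled_java(app_dir, env=None):
    """Point JAVA_HOME and PATH at a Java runtime shipped inside app_dir.

    The "jre" sub-folder is a hypothetical layout chosen for this sketch.
    """
    env = os.environ if env is None else env
    java_home = Path(app_dir) / "jre"
    java_bin = java_home / "bin"
    env["JAVA_HOME"] = str(java_home)
    # Prepend, so the bundled java wins over any system-wide install.
    env["PATH"] = str(java_bin) + os.pathsep + env.get("PATH", "")
    return java_bin
```

The call has to happen before the first PDF conversion, because the child java process inherits the environment at the moment it is started.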
Log 11-22-2020
Selecting files and processing them one at a time is stupid.
Just save the files on the desktop. Because each filename is an MD5 sum, a 32-character hash ending with .pdf, I just have to match that pattern to process all the valid detail PDF files on the desktop, then save the results to a folder.
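The filename matching can be sketched with a regular expression. This is a minimal version assuming the hash is written in hexadecimal; find_detail_pdfs is a hypothetical helper name.

```python
import re
from pathlib import Path

# 32 hexadecimal characters (an MD5 digest) followed by ".pdf".
DETAIL_PDF = re.compile(r"[0-9a-fA-F]{32}\.pdf")


def find_detail_pdfs(folder):
    """Return the files in folder whose whole name looks like <md5>.pdf."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and DETAIL_PDF.fullmatch(path.name)
    )
```

Using fullmatch rather than search keeps files such as a 33-character name or a .pdf.bak copy out of the batch.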
Log 11-23-2020
Spent a long time tweaking the final printing format. The process is tedious, but it has to be done.
The final version is pretty close to what the app is producing. But I cannot get the page info like Page 1 of 4 working.
I searched for a lot of VB scripts to calculate the total pages and set the page info on the fly, but all of them failed.
Have to dig deeper.
Log 11-25-2020
Finally nailed that page info thing.
Add the VB script for calculating the page info manually in the worksheet first, then use vb_extract.py to extract the script to a bin file that can be called by Excel on startup as a macro. Of course, another thing has to be installed, wps2019vba.exe, to enable macro execution in WPS.
I chose WPS because it's free 😀
Now, a perfect Page 1 of 3 is printed in the bottom right corner.
Log 11-26-2020
Started to work on the server side: design the database schema, create the database and tables.
I need to store the users who use my program. This information can be used for authentication if I later decide to distribute my program.
I need to store the data a user prints each time for later reference. It's like a printing history.
I also need to store summary information for each print, which can be referenced later like an annual report. How many clothes have I sent to my clients? Check the summary: it should present the total amount of clothing and the revenue.
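The three kinds of data described above (users, printing history, per-print summary) could be laid out as in the sketch below. It uses Python's built-in sqlite3 purely as a stand-in for the real database server, and every table and column name is an assumption made for illustration.

```python
import sqlite3

# Hypothetical schema: one row per user, per printed item, and per print job.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    phone TEXT NOT NULL UNIQUE,
    activated_at TEXT
);
CREATE TABLE print_items (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    print_id TEXT NOT NULL,
    product TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    price REAL NOT NULL,
    printed_at TEXT NOT NULL
);
CREATE TABLE print_summaries (
    print_id TEXT PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    total_quantity INTEGER NOT NULL,
    total_revenue REAL NOT NULL,
    printed_at TEXT NOT NULL
);
"""


def create_database(path=":memory:"):
    """Create the tables and return an open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def yearly_report(conn, user_id, year):
    """Return (total quantity, total revenue) printed by a user in a year."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total_quantity), 0), COALESCE(SUM(total_revenue), 0) "
        "FROM print_summaries WHERE user_id = ? AND printed_at LIKE ?",
        (user_id, f"{year}-%"),
    ).fetchone()
    return row[0], row[1]
```

Keeping a separate summary row per print means the annual report is a single SUM over a small table instead of a scan over every printed item.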
Log 11-27-2020
The database and tables are done.
Created project Docker-LEMP with docker-compose to quickly deploy a LEMP Docker environment for testing. The Python client will connect to the production server or the testing server accordingly.
Now to work out the server-side scripts.
Finish the insert script first to save printing data (details of the products in each print, quantity, price, timestamp, etc.).
A little testing with the insertion.
Now working on the query part. Just write a Python script for the request to test whether the script is OK. Later I'm going to do it in Swift in the iOS app.
Log 11-28-2020
Have to work on the printing part. Test print and see if the format still needs to be adjusted.
Done with printing and formatting.
Check the data in the database.
Everything looks good.
Log 11-29-2020
Refactoring code on both sides. Modify code structures, create utils and base classes.
Have to think about registration. People don't know much about computers, so I have to take care of registration myself. That being said, I have to design a registration process separate from the program itself.
Design a registration HTML page, collect the information needed and register the user.
I never rule out the possibility of distributing commercially, so some ACL is required.
Implement the server-side registration script.
A little registration test.
Next part is user authentication and membership expiration.
Log 12-01-2020
Registered users will sit in the wait list; they will be activated in the future when they start to use the program.
Implement user authentication (login) on the client side. The user has to provide their phone number (the one used in registration) to complete the auth process.
And I want to limit the use of my program to the exact computer that the user activates on. Simply put, once activated, the program is bound to this computer only; copying my program to other computers won't work.
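How the binding was actually done is not recorded in this log. One common approach, sketched here as an assumption, is to hash a hardware identifier together with the phone number at activation and compare it on every start. The function names are hypothetical, and uuid.getnode() can fall back to a random value on machines where no MAC address is readable.

```python
import hashlib
import hmac
import uuid


def machine_fingerprint(phone, node=None):
    """Hash the phone number together with this machine's hardware address."""
    node = uuid.getnode() if node is None else node
    raw = f"{phone}:{node:012x}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def is_same_machine(phone, stored_fingerprint, node=None):
    """Compare the current fingerprint with the one saved at activation."""
    current = machine_fingerprint(phone, node)
    return hmac.compare_digest(current, stored_fingerprint)
```

The fingerprint saved at activation would live on the server, so a copied program folder fails the comparison on any other computer.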
Now implement the membership expiration script on the server side. Show the days left in the menu bar of the client. Upon activation, the user gets 365 days (a year) of membership.
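The days-left calculation is simple date arithmetic. Here is a minimal sketch in Python, assuming the activation date is stored as a plain date; days_left and is_expired are illustrative names.

```python
from datetime import date, timedelta

MEMBERSHIP_DAYS = 365  # one year of membership, counted from activation


def days_left(activated_on, today=None):
    """Return the remaining membership days, never less than zero."""
    today = date.today() if today is None else today
    expires_on = activated_on + timedelta(days=MEMBERSHIP_DAYS)
    return max((expires_on - today).days, 0)


def is_expired(activated_on, today=None):
    """True once no membership days remain."""
    return days_left(activated_on, today) == 0
```

Passing today explicitly makes the function easy to test and lets the server, rather than the client's clock, decide what day it is.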
A little test.
Now it's just a matter of deciding the price 😀
Log 12-02-2020
Design an upgrade module.
Check for updates periodically. If there is a newer version, prompt the user for permission to upgrade. If yes, start the updating process, kill the main process, unzip the downloaded file and start the main process again.
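The "is there a newer version" check at the start of that flow can be sketched as a tuple comparison. This assumes dotted numeric version strings such as v1.1; the function names are hypothetical, and the download, kill and unzip steps are left out.

```python
def parse_version(text):
    """Turn 'v1.10' or '1.10' into a tuple of integers such as (1, 10)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def update_available(current, latest):
    """True if the latest published version is newer than the running one."""
    return parse_version(latest) > parse_version(current)
```

Comparing tuples of integers avoids the classic string-comparison trap where "1.10" sorts before "1.9".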
Separate the update module and auth module from the main module.
Separate the PDF converting and printing module too.
Log 12-04-2020
The upgrade module requires a lot of testing.
Write an About module too. That can be as simple as possible.
Log 12-05-2020
It's been half a month. What I have to do now is find a way to shorten the printing path, meaning that the clerks won't have to bother to share the file, save it on the computer, click a button in the program and then get the printed sheet. They should just share the document from inside the app through some other app, the request will be sent to the computer, and bang, the document gets printed.
All a clerk has to do is query the app for the data she needs and share it, and that's it.
Find a way to do that.
Log 12-06-2020
The best way to do that is through the content sharing feature of iOS. It's like on Android: you press the share button and a system-wide notification is sent, calling out to the apps that can handle this type of sharing.
I have to write an app for sharing PDF documents that sends the data to my computer.
I also have to implement a server on the client side to receive the data, then run the same process.
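A receiving server on the desktop can be sketched with Python's built-in http.server. This minimal version assumes the sharing app sends the PDF as the raw body of a POST request with a Content-Length header. The inbox folder, the MD5-based filename and the port are choices made for this sketch, not details of the real program.

```python
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path


class ReceiveHandler(BaseHTTPRequestHandler):
    inbox = Path("inbox")  # hypothetical folder for received documents

    def do_POST(self):
        # Read exactly as many bytes as the client announced.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.inbox.mkdir(parents=True, exist_ok=True)
        # Name the file after the MD5 of its content (an assumption).
        name = hashlib.md5(body).hexdigest() + ".pdf"
        (self.inbox / name).write_bytes(body)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")


def serve(port=8000):
    """Listen on all interfaces so the iPad on the same network can connect."""
    HTTPServer(("0.0.0.0", port), ReceiveHandler).serve_forever()
```

Once the file is on disk, the existing convert-and-print pipeline can pick it up unchanged.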
Log 12-07-2020
Create project Sharebly. The rough idea is to configure what I want to include in the final printing (customer's name, price, side notes, etc.).
I have to write a share view with lots of switches for different items that I want to include.
Have to register an Apple developer account first.
$99 😀
Have to get familiar with some Swift programming.
Log 12-10-2020
The UI for my app is done.
Having some trouble transmitting the PDF data. When saving the encoded data back to PDF, it was corrupted. Something is wrong with the encoding. Reference here.
Nailed the transmission part. I can save the PDF and process it.
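The log does not say which encoding was used or what the fix was. A frequent cause of this kind of corruption is treating binary PDF bytes as text somewhere along the way. One safe pattern, shown here as an assumption rather than as the fix that was actually applied, is to Base64-encode on the sending side and decode back to bytes on the receiving side.

```python
import base64


def encode_pdf(pdf_bytes):
    """Turn raw PDF bytes into ASCII text that survives text-based transport."""
    return base64.b64encode(pdf_bytes).decode("ascii")


def decode_pdf(text):
    """Recover the exact original bytes; reject input that is not valid Base64."""
    return base64.b64decode(text, validate=True)
```

On the receiving side the decoded result should be written with a binary file mode, otherwise newline translation can corrupt the file again.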
Log 12-12-2020
Did a lot of testing for the app. Made sure that documents shared through my app can be properly received and printed.
A small problem with Python's built-in HTTP server: after some time it seems to go into hibernation and can't receive any requests, but after a few tries it works again. Not so reliable. I should consider dumping it for gunicorn when going into production.
Log 12-13-2020
Some more refactoring across the Python client, the iOS app and the server-side scripts.
Add more features to the membership expiration function. Users must contact me and pay the fee to renew their subscription for another year.
Test the expiration: when expired, the app will not start. Consider starting a background service that checks for expiration every couple of hours. Add that to the TODO list.
Log 12-17-2020
It's time to deal with the Python HTTP server bad request problem. I have to get the data sent by the app, because that data contains lots of information about the user, so I don't have to maintain that information on my own; I just get it from the app. Convenient.
Looking through the source code in /path/to/python/Lib/http/server.py, I finally tracked it down to the handle_one_request method. It invokes the parse_request method, which parses the incoming request and matches a lot of rules against it to see if it's a valid request. If not, parse_request returns False, and the handle_one_request method does nothing but return.
My Python client server inherits from BaseHTTPRequestHandler, so I can just define a method in my client that grabs the raw_requestline, which is the request sent by the app, and parses it anyway. I'll name the custom method do_push_request_line.
Then, add self.do_push_request_line() to the handle_one_request method before it returns on a failed request parse.
Replace the Python source file with my new file.
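The same hook can also be reached without editing server.py, by overriding parse_request in the subclass. The sketch below shows that alternative, not the patched-file approach described above. The class name and the captured list are made up for illustration, and only the first line of the request is kept, exactly as raw_requestline holds it.

```python
from http.server import BaseHTTPRequestHandler


class TolerantHandler(BaseHTTPRequestHandler):
    # Shared on purpose: collects every raw request line that failed parsing.
    captured = []

    def do_push_request_line(self):
        """Keep the raw bytes of a request the standard parser rejected."""
        self.captured.append(self.raw_requestline)

    def parse_request(self):
        ok = super().parse_request()
        if not ok:
            # The base class has already answered with an error response;
            # the data is grabbed anyway before handle_one_request returns.
            self.do_push_request_line()
        return ok
```

Because nothing in the Python installation is modified, this variant survives a Python upgrade and does not need a patched file shipped with the program.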
Problem solved. What I intercepted is XML data, so I have to parse it with the xml module.
Now, grab whatever information I want from the data and put it in my final excel document.
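Pulling fields out of the intercepted XML could look like the sketch below, using xml.etree.ElementTree from the standard library. The real structure of the app's XML is not shown in this log, so the tag names in the usage note are invented.

```python
import xml.etree.ElementTree as ET


def extract_fields(xml_bytes, wanted):
    """Return {tag: text} for the first element of each wanted tag.

    Tags that do not occur in the document map to an empty string.
    """
    root = ET.fromstring(xml_bytes)
    return {
        tag: (root.findtext(f".//{tag}") or "").strip()
        for tag in wanted
    }
```

For example, extract_fields(data, ["customer", "total"]) would return a small dictionary that can be written straight into the cells of the Excel sheet.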
Log 12-20-2020
Tested for a couple of days.
Add an icon image to the app.
Separate the default configuration into a file.
Log 12-22-2020
Thinking about how to ship the program to end users. Of course I cannot just hand them the source code; that would be stupid.
Dug a little. I'll use PyInstaller to bundle all modules into an exe file plus resources.
Fumbling around with PyInstaller to get it working first.
Log 12-23-2020
Log the PyInstaller command and test it on a vanilla Windows virtual machine where nothing is installed: no Python, no Java, nothing. Make sure every resource path is correct.
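Resource paths are the usual trap here: a PyInstaller bundle exposes its unpacked files through sys._MEIPASS, which does not exist when running from source. Here is a minimal helper as a sketch; resource_path is a conventional but hypothetical name, and the fallback to the current working directory is an assumption about how the program is started during development.

```python
import sys
from pathlib import Path


def resource_path(relative):
    """Resolve a bundled resource both from source and from a PyInstaller build."""
    # PyInstaller sets sys._MEIPASS to the folder holding the bundled files.
    base = Path(getattr(sys, "_MEIPASS", Path.cwd()))
    return base / relative
```

Every icon, configuration file or bundled runtime would then be opened through resource_path(...) instead of a hard-coded relative path.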
Log 12-24-2020
Things are getting busy in the warehouse. I have to help until very late. No time for the server upgrade.
A workaround for the seeming hibernation of the Python HTTP server: I just start a process that requests a 404 page on the server every 45 seconds, acting as a heartbeat connection.
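The heartbeat can be sketched as follows. This version uses a background thread rather than a separate process, which is a swap made to keep the example short, and the function names and the injectable ping parameter are assumptions.

```python
import threading
import urllib.error
import urllib.request


def ping_once(url, timeout=5.0):
    """Request the URL once; return the HTTP status, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # a 404 still proves the server answered
    except OSError:
        return None


def start_heartbeat(url, interval=45.0, ping=ping_once):
    """Ping the URL every interval seconds until the returned event is set."""
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            ping(url)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread, stop
```

Calling stop.set() ends the loop without waiting out the rest of the interval, which keeps shutdown quick.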
Log 12-25-2020
Final stages.
Think about automating the whole building and testing process.
Building first.
Write a script to handle all the building and zipping, like a local CI, which generates a zip file of the latest version of the program.
Just send it to authenticated users; they extract it and use it.
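The zipping half of such a build script can be sketched with shutil from the standard library. This assumes the build output already sits in a folder such as PyInstaller's dist directory; the archive name prefix and the zip_build helper are invented for illustration.

```python
import shutil
from pathlib import Path


def zip_build(dist_dir, out_dir, version):
    """Zip the contents of dist_dir into out_dir and return the archive path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # make_archive appends ".zip" to the base name by itself.
    base = out_dir / f"printer-client-{version}"
    return Path(shutil.make_archive(str(base), "zip", root_dir=str(dist_dir)))
```

Keeping out_dir outside dist_dir matters, otherwise the archive would try to include itself.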
Log 12-26-2020
No time to write any tests. Leave that as it is for now.
Test updating again. Make sure everything works.
Log 01-12-2021
Final cleanup. Running the v1.1 branch in the warehouse now.
Summary
The program helps a lot: after a few clicks, what used to take minutes now takes just seconds.