I find myself wanting to play around with the Stack Exchange API, partly as an excuse to learn Laravel, but sadly my first use case turned out not to be supported by the API. So I thought I might build a bad-post detector instead, using the search call. The idea is that community editors can see at a glance which posts are most in need of repair (and, optionally, of down-votes or close votes).
I’ll need a set of tables to store the “bad phrase” queries, each of which will be run in turn, with a set of question IDs being returned in each case. At the end of a run, questions can have a score totalled, with higher scores indicating higher levels of bad writing. The API permits 10,000 calls per day, which should be enough for a few full refreshes.
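To sanity-check the "few full refreshes" claim, here is a rough budget calculation. It is only a sketch: the page size of 100 results per call and the example queries are my assumptions, and it takes the worst case where every enabled query fills its `max_results`.

```python
import math

DAILY_QUOTA = 10_000  # calls per day, as stated above
PAGE_SIZE = 100       # assumed number of results fetched per API call


def calls_for_query(max_results, page_size=PAGE_SIZE):
    """Worst-case number of pages needed to fetch up to max_results items."""
    return math.ceil(max_results / page_size)


def calls_per_refresh(queries):
    """Total calls for one full run over every enabled query."""
    return sum(calls_for_query(q["max_results"]) for q in queries if q["is_enabled"])


def refreshes_per_day(queries, quota=DAILY_QUOTA):
    """How many full runs fit into the daily quota."""
    per_refresh = calls_per_refresh(queries)
    return quota // per_refresh if per_refresh else 0


# Hypothetical queries, shaped like rows of the query table described below.
example_queries = [
    {"description": "please help", "max_results": 1000, "is_enabled": True},
    {"description": "urgent", "max_results": 500, "is_enabled": True},
    {"description": "switched off", "max_results": 2000, "is_enabled": False},
]

print(calls_per_refresh(example_queries))
print(refreshes_per_day(example_queries))
```

With many more queries the number of refreshes drops quickly, so `max_results` is the main knob for staying inside the quota.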
|Table||Column||Type||Nullable||Notes|
| ||description||VARCHAR||No||Describes the purpose of this query|
| ||post_type||ENUM(question, answer)||Yes||Null means no type restriction|
| ||set_id||INTEGER||Yes||Useful where we should take the highest score of a set only|
| ||max_results||INTEGER||No||Limit of how many posts to grab|
| ||is_enabled||BOOLEAN||No||Defaults to true, disabled if false|
| ||description||VARCHAR||No||Describes the purpose of this set|
|run_stats||run_id||INTEGER (FK)||No||Composite PK|
|run_stats||query_id||INTEGER (FK)||No||Composite PK|
|run_stats||completed_pages||INTEGER||No||Starts at zero, equals page count when completed|
|run_item||answer_id||INTEGER||Yes||Not null if the found item is an answer|
The last two tables probably need a bit of expansion:
- run_stats: metadata per query per run
- run_item: a found item for a query and run
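To make those two tables a little more concrete, here is a minimal sketch using SQLite from Python. The `page_count` and `question_id` columns and the helper function names are my assumptions, and the foreign keys are left out because the parent tables are not defined here. The point is that `completed_pages` moves forward in the same transaction as the inserted items, so an interrupted run can resume where it stopped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE run_stats (
    run_id          INTEGER NOT NULL,
    query_id        INTEGER NOT NULL,
    completed_pages INTEGER NOT NULL DEFAULT 0,
    page_count      INTEGER,           -- assumed column: total pages for this query
    PRIMARY KEY (run_id, query_id)     -- composite PK, as in the table above
);
CREATE TABLE run_item (
    run_id      INTEGER NOT NULL,
    query_id    INTEGER NOT NULL,
    question_id INTEGER NOT NULL,      -- assumed column
    answer_id   INTEGER                -- not null only if the found item is an answer
);
""")


def start_query(conn, run_id, query_id, page_count):
    """Create the stats row for one query within one run."""
    with conn:
        conn.execute(
            "INSERT INTO run_stats (run_id, query_id, page_count) VALUES (?, ?, ?)",
            (run_id, query_id, page_count),
        )


def record_page(conn, run_id, query_id, items):
    """Store one page of (question_id, answer_id) pairs and bump the page counter."""
    with conn:  # single transaction: items and counter change together
        conn.executemany(
            "INSERT INTO run_item (run_id, query_id, question_id, answer_id) "
            "VALUES (?, ?, ?, ?)",
            [(run_id, query_id, q, a) for q, a in items],
        )
        conn.execute(
            "UPDATE run_stats SET completed_pages = completed_pages + 1 "
            "WHERE run_id = ? AND query_id = ?",
            (run_id, query_id),
        )


def is_complete(conn, run_id, query_id):
    """True once completed_pages equals the page count."""
    row = conn.execute(
        "SELECT completed_pages, page_count FROM run_stats "
        "WHERE run_id = ? AND query_id = ?",
        (run_id, query_id),
    ).fetchone()
    return row is not None and row[0] == row[1]


start_query(conn, run_id=1, query_id=1, page_count=2)
record_page(conn, 1, 1, [(101, None), (102, 555)])
halfway = is_complete(conn, 1, 1)
record_page(conn, 1, 1, [(103, None)])
finished = is_complete(conn, 1, 1)
```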
Some examples of independent searches:
Some examples of search sets (which are equivalent but may be scored differently if required):
- please help me out > please help me > please help = help me please > help please
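The "highest score of a set only" rule can be sketched as follows. The scores and the function name are made up for illustration; the idea is that a post containing "please help me out" also matches the shorter phrases in the same set, and should only be charged once.

```python
def post_score(matches):
    """Total score for one post.

    matches: iterable of (set_id, score) pairs, one per query that found the post.
    A set_id of None marks an independent search, which always counts in full.
    Within a set, only the highest-scoring match counts.
    """
    independent_total = 0
    best_in_set = {}
    for set_id, score in matches:
        if set_id is None:
            independent_total += score
        else:
            best_in_set[set_id] = max(score, best_in_set.get(set_id, score))
    return independent_total + sum(best_in_set.values())


# Hypothetical scores for set 1:
#   "please help me out" = 4, "please help me" = 3, "please help" = 2
matches_for_post = [
    (1, 4),     # please help me out
    (1, 3),     # please help me (same set, ignored)
    (1, 2),     # please help (same set, ignored)
    (None, 5),  # some independent search
]
print(post_score(matches_for_post))
```

Without the set rule this post would score 14; with it, the three overlapping phrases collapse to the single best match.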
At present the score of a question or answer would be determined by a SUM() query with a GROUP BY, though if this gets excessively slow, query results can be snapshotted to a new table.
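As a sketch of that aggregation, again with SQLite from Python: the `search_query` table name, its `score` column, the `run_score` snapshot table and the sample rows are all assumptions of mine. It deliberately ignores sets, so it shows only the plain `SUM()` with `GROUP BY`, followed by the snapshot fallback.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE search_query (          -- assumed table name and score column
    id          INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    score       INTEGER NOT NULL
);
CREATE TABLE run_item (
    run_id      INTEGER NOT NULL,
    query_id    INTEGER NOT NULL REFERENCES search_query(id),
    question_id INTEGER NOT NULL,
    answer_id   INTEGER
);
INSERT INTO search_query VALUES (1, 'please help', 2), (2, 'urgent', 3);
INSERT INTO run_item VALUES
    (1, 1, 101, NULL), (1, 2, 101, NULL), (1, 1, 102, NULL), (1, 2, 103, 900);
""")

# Shared aggregation: one row per found question or answer, with its total.
AGGREGATE_SQL = """
SELECT i.run_id, i.question_id, i.answer_id, SUM(q.score) AS total
FROM run_item AS i
JOIN search_query AS q ON q.id = i.query_id
WHERE i.run_id = ?
GROUP BY i.run_id, i.question_id, i.answer_id
"""


def scores_for_run(conn, run_id):
    """Live scores for a run, highest first."""
    sql = (
        "SELECT question_id, answer_id, total FROM (" + AGGREGATE_SQL + ") "
        "ORDER BY total DESC, question_id"
    )
    return conn.execute(sql, (run_id,)).fetchall()


def snapshot_scores(conn, run_id):
    """Copy the aggregated scores into a plain table for cheap reads."""
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS run_score "
            "(run_id INTEGER, question_id INTEGER, answer_id INTEGER, total INTEGER)"
        )
        conn.execute("DELETE FROM run_score WHERE run_id = ?", (run_id,))
        conn.execute("INSERT INTO run_score " + AGGREGATE_SQL, (run_id,))


live = scores_for_run(conn, 1)
snapshot_scores(conn, 1)
snap = conn.execute(
    "SELECT question_id, answer_id, total FROM run_score "
    "ORDER BY total DESC, question_id"
).fetchall()
```

Handling sets in SQL would need an inner `MAX()` per set before the outer `SUM()`, which is one reason the snapshot table may end up being the simpler option.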
The UI of the app would show the latest results, highest score first, along with a breakdown of how each score was reached. It would also list all live searches and, for an admin, allow searches to be edited. Easy!