GHC 2019-04-10

10 comments.

, https://git.io/fjqsr in jarun/googler
Better explain "Connection blocked due to unusual activity."
============================================================

Before:

```console
$ googler "I use googler for automated scraping."
[ERROR] Connection blocked due to unusual activity.
```

After:

```console
[ERROR] Connection blocked due to unusual activity. THIS IS NOT A BUG, please do NOT report it as a bug unless you have specific information that may lead to the development of a workaround. Your IP address is temporarily or permanently blocked by Google and requires reCAPTCHA-solving to use the service, which googler is not capable of. Possible causes include issuing too many queries in a short time frame, or operating from a shared / low reputation IP with a history of abuse. Please do NOT use googler for automated scraping.
```

, https://git.io/fjq3k in jarun/googler
curl has loads of options too, with many groups of options doing roughly the same things with very slight variations or some options being very specific shortcuts for others. It's of course nice to keep everything pure and simple, but for mere mortals practicing iterative designs, not unnecessarily breaking existing usage is probably more important than purity :)

, https://git.io/fjqOT in jarun/googler
--nocolor and --colorize without an argument are complete opposites, so it might not be a great idea to rebrand -C. The point of --colorize=auto is that it should do the right thing 99% of the time, so you basically never need --colorize. We are basically reducing most -C use cases (piping) to the default. For the remaining 1%, --nocolor as a shortcut should still be the prevalent use case.

, https://git.io/fjqOk in jarun/googler
My decisions should be adequately explained above. One thing I forgot to mention: I dropped the Anniversary Update version check because this sort of string comparison is not robust at all. If SetConsoleMode doesn’t work, so be it, we just fall back to default behavior.
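
To illustrate why comparing version strings is fragile, here is a minimal sketch. It does not reproduce the dropped check; the helper `version_tuple` and the constant names are hypothetical, and the two version numbers are just the well-known ones for Windows 8.1 and the Windows 10 Anniversary Update.

```python
# Hypothetical illustration only; not the check that was removed.
ANNIVERSARY_UPDATE = '10.0.14393'  # Windows 10, version 1607
WINDOWS_8_1 = '6.3.9600'


def version_tuple(text):
    """Turn a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split('.'))


# Strings compare character by character, so '6' > '1' decides the result
# and Windows 8.1 looks "newer" than the Anniversary Update.
naive_is_new_enough = WINDOWS_8_1 >= ANNIVERSARY_UPDATE

# Comparing numeric components gives the intended ordering.
numeric_is_new_enough = (
    version_tuple(WINDOWS_8_1) >= version_tuple(ANNIVERSARY_UPDATE)
)
```

Even the numeric form assumes the platform reports a well-formed dotted version, which is one more reason to skip the check and let SetConsoleMode succeed or fail on its own.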

, https://git.io/fjqOI in jarun/googler
@guilt Your request is now implemented with some changes, thanks.

, https://git.io/fjqmo in jarun/googler
Only difference I can think of: on Ubuntu bionic I have Python 3.6.7 with openssl 1.1.0g-2ubuntu4.3, on python:3.6-slim I have Python 3.6.8 with openssl 1.1.0j-1~deb9u1, so there could be a cipher suite difference.
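
One way to check the cipher suite hypothesis would be to dump what each environment offers and diff the results. This is only a sketch: `tls_fingerprint` is a hypothetical helper, and it inspects Python's default SSL context, which may not match the context googler actually uses.

```python
import ssl
import sys


def tls_fingerprint():
    """Collect the Python/OpenSSL versions and the default cipher list."""
    context = ssl.create_default_context()
    return {
        'python': sys.version.split()[0],
        'openssl': ssl.OPENSSL_VERSION,
        # get_ciphers() returns one dict per enabled cipher.
        'ciphers': [cipher['name'] for cipher in context.get_ciphers()],
    }


if __name__ == '__main__':
    info = tls_fingerprint()
    print(info['python'], info['openssl'])
    print('\n'.join(info['ciphers']))
```

Running it on the host and inside the container and diffing the two outputs would show whether the offered ciphers differ.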

, https://git.io/fjqmK in jarun/googler
Amusingly, on my Linode server (Ubuntu bionic), when I run googler directly from Ubuntu, the requests are perma blocked, but when I run googler tests from within an ~alpine~ slim-stretch container with Docker on the exact same host, the requests are not blocked. Should be the exact same IP address, the exact same request and the exact same headers. What a mystery.

, https://git.io/fjqm6 in jarun/googler
I noticed PR building isn't turned on for CircleCI. I turned it on at https://circleci.com/gh/jarun/googler/edit#advanced-settings, but I don't want to enable "Pass secrets to builds from forked pull requests", since as they say a malicious PR could otherwise reveal secrets... We should really move `NUM_TEST_ITERATIONS` and `SLEEP_DURATION` to `config.yml` instead. (While we're at this, I realize I promised a Travis purge, so this will be part of that. It will have to wait until later tonight or tomorrow though.)

, https://git.io/fjqmi in jarun/googler
By the way, sorry about the delay. I didn't do this last night because (1) you never know how much time you'll spend on learning/tinkering with Windows stuff (turns out I spent a fair bit of time, even though I have a reasonable level of experience with using ctypes on Linux); (2) I need a clear head to make non-technical UX decisions.

, https://git.io/fjqmP in jarun/googler
Smarter colorization and better support for native terminals on Windows
=======================================================================

This is really two related PRs crammed into one so as to not create any branch dependency.

Let me paste in the commit messages first:

## Introduce smarter coloring and `--colorize` option

By default, do not colorize when stdout is not a tty. The --colorize option,
with possible values 'auto', 'always', and 'never' -- defaults to 'auto' when
not specified, and registers as 'always' when specified without a value -- is
similar to the --color option found in many well known *nix tools, with
coreutils ls(1) being a notable example. (Our option is named --colorize since
we already have a --colors option.) The old -C, --nocolor option is now a
shortcut for --colorize=never.

Limitation: Python argparse cannot make the equal sign between a long option
and its value mandatory. One might expect the following to work:

    googler --colorize google

but it does not, due to "google" being parsed as an argument to --colorize. One
has to write

    googler --colorize -- google

instead.
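
The option semantics and the limitation described above can be sketched with plain argparse. This is an approximation under stated assumptions, not the code from the commit: `build_parser`, `should_colorize`, the `googler-sketch` program name and the `keywords` positional are all hypothetical.

```python
import argparse
import io
import sys


def build_parser():
    parser = argparse.ArgumentParser(prog='googler-sketch')
    # nargs='?' makes the value optional: 'auto' when the option is absent,
    # 'always' (const) when the option is given without a value.
    parser.add_argument(
        '--colorize', nargs='?', choices=['auto', 'always', 'never'],
        const='always', default='auto', help='whether to colorize output')
    # -C/--nocolor writes to the same destination as --colorize=never.
    parser.add_argument(
        '-C', '--nocolor', dest='colorize', action='store_const',
        const='never', help='shortcut for --colorize=never')
    parser.add_argument('keywords', nargs='*')
    return parser


def should_colorize(mode, stream=None):
    """Resolve 'auto' by checking whether the stream is a tty."""
    stream = sys.stdout if stream is None else stream
    if mode == 'always':
        return True
    if mode == 'never':
        return False
    return stream.isatty()


if __name__ == '__main__':
    args = build_parser().parse_args(['--colorize', '--', 'google'])
    print(args.colorize, args.keywords)
    # An in-memory stream stands in for a pipe: it is not a tty.
    print(should_colorize('auto', io.StringIO()))
```

In this sketch, passing `['--colorize', 'google']` makes argparse treat `google` as the option's value, reject it as an invalid choice and exit, which is the limitation noted in the commit message.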

*Note: Initially I considered "smartly" enabling/disabling coloring on Windows consoles, so it is only fair to pick the cross-platform low-hanging fruit in terms of smart colorization, i.e., tty detection, first. Upon further consideration I rejected the previous decision, but this change was already made, and it seems okay to leave it as is.*

## Enable ANSI color in cmd and PowerShell on Windows 10

VT100 control sequences are supported in cmd and PowerShell starting from
Windows 10 Anniversary Update, but only if the
ENABLE_VIRTUAL_TERMINAL_PROCESSING flag is set on the screen buffer handle
using SetConsoleMode.

Setting this flag does not seem to negatively affect third party terminal
emulators with native ANSI support such as ConEmu, so hopefully there's no
regression.

References:
https://docs.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences
https://docs.microsoft.com/en-us/windows/console/setconsolemode

Credits:
https://github.com/jarun/googler/pull/275
https://github.com/tartley/colorama/pull/139
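
For readers unfamiliar with the Win32 side, setting the flag from Python with ctypes could look roughly like the sketch below. It is not the code from the commit: `enable_vt_processing` is a hypothetical name, the constants are the documented Win32 values, and any failure simply returns `False` so the caller can carry on with default behavior.

```python
import sys

STD_OUTPUT_HANDLE = -11                      # documented Win32 constant
ENABLE_VIRTUAL_TERMINAL_PROCESSING = 0x0004  # documented console mode flag


def enable_vt_processing():
    """Best-effort attempt to enable VT100 sequences; True on success."""
    if sys.platform != 'win32':
        return False
    try:
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
        kernel32.GetStdHandle.restype = wintypes.HANDLE
        kernel32.GetConsoleMode.argtypes = [wintypes.HANDLE, wintypes.LPDWORD]
        kernel32.SetConsoleMode.argtypes = [wintypes.HANDLE, wintypes.DWORD]

        handle = kernel32.GetStdHandle(wintypes.DWORD(STD_OUTPUT_HANDLE))
        mode = wintypes.DWORD()
        # GetConsoleMode fails when stdout is redirected or invalid.
        if not kernel32.GetConsoleMode(handle, ctypes.byref(mode)):
            return False
        new_mode = mode.value | ENABLE_VIRTUAL_TERMINAL_PROCESSING
        return bool(kernel32.SetConsoleMode(handle, new_mode))
    except Exception:
        # Never let console setup break the program.
        return False
```

The return value is informational only in this sketch; consistent with the approach described here, a failure would not be a reason to strip color.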

## Further explanations on some decisions

Not included in the commit messages, but worth mentioning:

- I tested this on latest Windows 10 (Pro 1809, 17763.379) native cmd.exe, PowerShell.exe and latest ConEmu preview build. I'll just assume it works in other third party terminal emulators too. I won't bother to test this on Windows 7. And no one uses Windows 8 anyway.

- The key difference between this and #275 is that we do not disable color on win32 just because it's not Windows 10 or because the ctypes code failed for whatever reason. Sure, you might still see garbage by default when using googler on Windows 7 or pre-1607 Windows 10 native cmd/PowerShell, but people using an ANSI-capable terminal emulator on those systems aren't suddenly punished and have their color stripped. Progressive enhancement fixing part of the issue is better than aiming for correctness at the cost of introducing regressions.

  By the way, I do hope poor souls stuck on Windows at least help themselves with a more modern terminal emulator, instead of MS crap that values compatibility with DOS more than usability in 2019.

- Speaking of fixing part of the issue: what about Windows 7 or pre-1607 Windows 10? I'm not even sure pre-1607 Windows 10 exists at this point, but Windows 7 native console certainly isn't targeted by this fix. I looked at how colorama does this, and it's ugly as hell: they have to hijack the streams, extract the control sequences and translate them into win32 calls between writes. Not surprisingly, it's not Unix, so text is not the universal interface, erratic win32 calls are.

  Since googler prides itself on being a single script without dependencies, we have to cut corners here instead of implementing a full colorama-type solution -- we certainly don't want to introduce and maintain 500 LOC for 0.01% of users.