Semidefinite thoughts

What I learned from being QIP PC chair

I was recently chair of the program committee for QIP 2018, and since I started the process by writing to past chairs for advice, I thought I’d write down my own thoughts in the hope that they might be useful later. I was inspired in part by Boaz Barak’s reflections on being PC chair, which contain some great ideas like “shell reviews” as well as a helpful overview of the whole process.

Before the talk submission deadline

One piece of advice from Boaz that I can’t repeat enough is to start everything early, including getting invited early to be PC chair. ;-) The early stages of the process involve inviting the PC, some back and forth with SC (steering committee) and local organizers about dates and policies, writing a call for papers (CFP), and distributing it widely. If you want to make any significant changes to last year’s CFP then you’ll need the SC’s feedback/approval. And if the changes are big enough that they’ll affect whether someone is willing to be on the PC, then you’ll also want these changes to be settled before you send out invitations to the PC. This early stage isn’t a lot of work but it does involve a lot of chasing busy people, writing emails in ways where it’s clear how to reply, and generally staying on top of things. In your invitation to the PC, you’ll want to set not only the “external” dates (like paper/poster submission/notification deadlines) but also various internal deadlines, such as when paper bids are due. Even internal deadlines need to specify the time of day and timezone, in part to communicate how important the deadlines are (if one person hasn’t done their paper bids, then no one can start reviewing).

Software. One thing to think about during this period is conference management software. The usual one for our community is easychair. Others like HotCRP might be better because of their greater flexibility (e.g. you could make tags like “q info theory papers needing discussion” to coordinate discussion) but then the PC will be less familiar with it. If you do easychair, you’ll probably want to get the local organizers to pay for at least the “professional” license, but even this doesn’t get you any support from easychair.org. It’s worth spending a little while playing around with the software in this early stage especially if you have been a PC member often but not a PC chair. Also, the software can change. For example, many of us were surprised that easychair dropped the step where PC members could approve subreviews before they went live.

Another software choice worth figuring out is one for scheduling parallel sessions. Chris Schaffner pointed out to me that the Chaos Computer Congress has software for this purpose (described here). In my case, I didn’t think of this until we had basically decided on the list of accepted talks, and it was too late for a process that involved waiting for people’s feedback, so I just made the tracks manually. The feedback I got about the parallel sessions was mostly positive but if you remember any problems with it, please email me or post it here.

PC selection. You’ll want diversity in region, seniority, demographics, and especially topic. Choosing the right topic balance is tricky. In my case, I expected more submissions in delegation/verification/validation and fewer in thermodynamics/resource theories/optics, among other mistakes. It is worth studying past PCs and asking their chairs about how the topic balance worked for them. Speaking of past PCs, you probably want a few people who have access to the reviews from the past few QIPs but otherwise it’s best to bring in people who haven’t been on the QIP PC very recently.

Co-chair and COI policy. Normally PC members can submit papers themselves, and historically this hasn’t been a problem. The QIP charter allows even the PC chair to submit papers, and you’ll have to decide on a process for this, or decide not to submit anything. My process was to choose a co-chair (Ronald de Wolf) who handled my submissions, and to be careful only to look at the list of papers when logged in as “track chair” and not “super-chair” (these are options for multitrack conferences on easychair). I think this went fine (all my papers were rejected :)) but you could make arguments either way here.

Another question is how to handle same-institution CoI. I used an easychair plugin that automatically marked papers as CoI if the authors were at the same institution, but this was imperfect in a few ways. On a technical level, the system often had out-of-date information about institution. But also, at some large institutions, you might almost never see someone who is technically your colleague there. On the other hand, when you see everyone’s CoIs, you realize that people make a lot of mistakes. I don’t have a great answer to this, other than generally trusting the judgment and integrity of the PC members.

Policy questions. Of course everyone’s favorite question is the 3-page abstract. I think they should be dropped and that this will save authors time without much negative impact on reviewers, who have always been allowed to skim. Others disagree for reasons including: reducing submissions where authors don’t make an effort, helping people write their submissions in a way that they are likely to be accepted (especially those who don’t regularly get papers into QIP), and helping PC members who weren’t initially assigned to a paper. (Perhaps one should also survey authors.) For QIP 2018, I (with the permission of the SC) officially dropped the need for an extended abstract. To compromise with some members of the PC who liked the 3-page abstracts, the CFP (see here for wording) still treats the first three pages of any longer submission somewhat like the old extended abstracts, making the change ultimately fairly mild. In my opinion, we should tell authors to communicate their ideas as clearly and efficiently as possible, without explicit page limits. But future PC chairs should at least think about the submission format issues early and discuss with a range of people. Before writing this post, I surveyed the QIP 2018 PC and there was still a range of opinion (4 liked the requirement and 4 didn’t). So I imagine that the QIP 2019 PC will be talking about this too!

Another new policy in QIP 2018 was to allow anonymous submissions. This raises various practical issues, such as figuring out who has a COI. However, in practice there were zero anonymous submissions. It may be that many papers are anyway on the arxiv, or that the issues of bias are that some names help a paper rather than that some names hurt a paper. This post claims that in one study, double-blind reviewing didn’t improve the fate of papers by female authors. I’ve seen elsewhere evidence in favor of double-blind reviewing (I can’t remember where) but it does seem like there is not much buy-in from the QIP community.

Subreview policy. It’s worth being explicit in advance about the role of subreviewers. Subreviewers let the PC members save time, can take advantage of expertise that’s not on the PC, and can sometimes read the long version of submissions in greater detail. However, they might be biased to favor their own subfield/friends and they may have trouble calibrating the scores on their review because they aren’t comparing to other papers from this year and/or they don’t have enough QIP experience to know what the typical acceptance thresholds are.

With hindsight, I think the right policy is to not have subreviewers enter scores. Easychair doesn’t exactly have this functionality but I realized partway through the process that I could allow blank scores, and could encourage subreviewers to use this.

The reason I think this is a good idea is that it forces the PC member to come up with a score. I had asked PC members to use subreviewers to help them understand a paper but to still come to their own opinion of the paper which they would be able to defend in discussion. This sometimes happened but often didn’t. A big part of what I did during the discussion period was to prod PC members to respond to questions, clarify reviews and generally engage in discussion. This seemed to be most necessary when a long detailed subreview had a score that was far from the scores of other reviewers, and the PC member seemed to not personally have endorsed that score.

Bids. I had 36 PC members and 296 talk submissions. I asked each PC member for 50 yes+maybe bids. With hindsight I would’ve asked for more. 50 was enough in most cases, but sometimes half the PC bids for the same paper, and other papers get only 1-2 bids total.
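To make the arithmetic behind these bid numbers concrete, here is a small Python sketch. The figure of 3 reviews per paper is an assumption for illustration (the average reported later in this post is 3.15), and the variable names are mine.

```python
# Back-of-the-envelope bid arithmetic using the numbers quoted above.
pc_members = 36
submissions = 296
bids_per_member = 50      # yes+maybe bids requested from each PC member
reviews_per_paper = 3     # assumption for illustration only

total_bids = pc_members * bids_per_member
avg_bids_per_paper = total_bids / submissions
reviews_needed = submissions * reviews_per_paper
reviews_per_member = reviews_needed / pc_members

print(f"total bids: {total_bids}")
print(f"average bids per paper: {avg_bids_per_paper:.1f}")
print(f"reviews needed: {reviews_needed}")
print(f"reviews per PC member: {reviews_per_member:.1f}")
```

An average of roughly six bids per paper sounds comfortable, but the average hides the uneven distribution: when half the PC bids on one paper, other papers are left with only 1-2 bids.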

During the discussion period
Moderating debate. My main job during the discussion phase was to prod PC members to reply to each other’s points and generally to keep the debate moving. I’d also summarize the state of the discussion (“looks like most people are in favor of this paper, but reviewer A has some concerns which I see the others haven’t addressed, and which look serious. B,C, what do you think?”) and sometimes invite in more reviewers. Here I tried to keep my own opinions fairly mild, and even if I suspected that I disagreed with people’s points, I would instead invite in more PC members.

However, I also assigned myself some papers (about half as many as the typical PC member) and reviewed them like the rest of the PC. This meant in one case, strongly disagreeing with another PC member’s opinion even after several rounds of back-and-forth discussion. So far, so good, since questions of significance, potential utility, and surprise are all fairly subjective. The problem is that this meant I couldn’t also be a moderator in that discussion, in a case where someone needed to moderate. In this case, I asked Ronald to moderate the discussion. However, with hindsight, I should have asked Ronald to moderate all the papers that I assigned myself to review, to prevent me from ever having to play both roles.

Most of the papers I reviewed were considered clearly below the bar. I think this is because people mostly bid on the papers they thought they would like, and I ended up taking many of the papers that didn’t have enough bids. So this mostly wasn’t an issue, but using the co-chair in this way is something I’d recommend future PC chairs do.

Timing. One area I definitely screwed up was the period 1-2 weeks into the discussion phase, when I didn’t push the PC quickly enough to focus on borderline papers. I was happy with the quality of the discussion in the end, but the final weeks felt more rushed than they should’ve been. Ronald suggested I move 5 papers to ‘accept?’ and 20 papers to ‘reject?’ each day, always with the option of moving them back to undecided if someone protests. I didn’t do this, but should have.
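As a rough illustration of what Ronald’s suggested pace would mean, here is a minimal Python sketch. It assumes the pace is held constant and that no paper is moved back to undecided, which is a simplification; the function name is hypothetical.

```python
import math

submissions = 296
accept_per_day = 5    # papers moved to 'accept?' each day
reject_per_day = 20   # papers moved to 'reject?' each day


def days_to_triage(total, per_day):
    """Days needed to give every paper a provisional label at a fixed daily pace."""
    return math.ceil(total / per_day)


days = days_to_triage(submissions, accept_per_day + reject_per_day)
provisional_accepts = days * accept_per_day

print(f"days to label every paper: {days}")
print(f"provisional accepts at that pace: {provisional_accepts}")
```

Under these assumptions every paper gets a provisional label in under two weeks, and the 5:20 split lands close to the number of papers that were eventually accepted.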

More generally, some PC members will use the discussion phase to actively look around for things that seem wrong, and others will wait until prodded. As a PC member, I was one of the active ones, and I mostly felt like I interacted with active PC members, but as chair, you see the entire range. One way of prodding someone is an easychair comment (although these emails might sometimes get filtered), but another way is to start marking papers as provisionally accepted/rejected.

Some stats

Here, for posterity, are some of the stats that were presented at the business meeting. The headline number is the 20.6% acceptance rate: 61 accepted out of 296 submissions. After 4 merges, this meant 57 talks. There were 256 posters accepted and not withdrawn, starting from 133 poster-only submissions, 168 from talk-or-poster submissions, minus the one rejected poster.
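The headline figures can be checked with a few lines of Python. This is only a sketch of the arithmetic; the post does not break down the gap between the poster candidates and the 256 presented, so reading that gap as withdrawals is my assumption, based on the phrase “not withdrawn”.

```python
# Talk statistics quoted above.
submissions = 296
accepted = 61
merges = 4

acceptance_rate = accepted / submissions
talks = accepted - merges

# Poster statistics quoted above.
poster_only = 133
from_talk_or_poster = 168
rejected_posters = 1
posters_presented = 256   # "accepted and not withdrawn"

poster_candidates = poster_only + from_talk_or_poster - rejected_posters
gap = poster_candidates - posters_presented   # assumed to be withdrawals

print(f"acceptance rate: {acceptance_rate:.1%}")
print(f"talks after merges: {talks}")
print(f"poster candidates: {poster_candidates}, gap to presented: {gap}")
```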

Timing

  • Within last minute of deadline: 1
  • Within last hour of deadline: 15
  • Within last day of deadline: 125
  • After deadline: 3
  • More than one week early: 32

PC output
There were 3.15 reviews/submission, meaning about 26 reviews per PC member (half that for me and Ronald). Of these, 10.4 on average were subreviews. In total, the PC wrote 285526 words of review + 91530 words of PC-only discussion, which comes to about 10000 words/reviewer, or 1250 words/submission.
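For anyone who wants to redo the workload arithmetic, here is a short Python sketch. It divides evenly across all 36 PC members, which is an approximation since the two chairs took about half a load each.

```python
submissions = 296
pc_members = 36
reviews_per_submission = 3.15
review_words = 285526
discussion_words = 91530

total_reviews = submissions * reviews_per_submission
reviews_per_member = total_reviews / pc_members

total_words = review_words + discussion_words
words_per_member = total_words / pc_members
words_per_submission = total_words / submissions

print(f"reviews per PC member: {reviews_per_member:.1f}")
print(f"total words: {total_words}")
print(f"words per PC member: {words_per_member:.0f}")
print(f"words per submission: {words_per_submission:.0f}")
```

The exact quotients come out slightly above the rounded figures quoted in the text.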

Questions for the future
The process brought up several issues that could use more discussion, and for which a clear decision from the SC/community might be useful.

Timeliness and resubmissions
The CFP says contributions should be “outstanding recent research contributions” but what exactly does “recent” mean? Can someone resubmit a paper rejected from QIP X to QIP X+1? My view (and the standard I pushed in QIP 2018) is that we shouldn’t have a hard cutoff but once something is more than a year old then most people in its target audience already know about it so the threshold should be considered higher. I don’t think we should be strict because sometimes papers get overlooked and we also don’t want to discourage early arXiv submissions or rewriting. According to this philosophy, we can consider not only the number of months a paper has been publicly available but also how much it has appeared at previous meetings or otherwise had widespread exposure. I also think resubmissions are ok, given the inherent noise in this process, but there should be enough overlap between PCs that people can see last year’s arguments against any given paper. Regardless of the policy here, it should be spelled out a bit more in the CFP.

Experimental work
QIP has historically been a (mostly) theory conference but as experimental progress in QC advances, should QIP start accepting more experimental talks? Or should we just have 1-2 invited experimental talks?

My guidance to the PC in 2018 was to say that we don’t have a strict rule but that the audience is a theorist audience, so we should evaluate experimental talks in terms of what would interest this audience. This will automatically rule out reports on the latest increases in qubit number and fidelity, but might include things like a new architecture, or experimental work that raises or settles new theory questions.

While I think this approach makes sense and allows some flexibility without opening the floodgates, there are reasons to prefer a more clearly defined approach. Some PC members would appreciate it to help them evaluate submissions. Also, it is arguably unfair to experimentalists interested in submitting to QIP if most of them mistakenly think that the conference is closed to them, or if they waste their time formatting their submissions for a conference that doesn’t evaluate their work.

Similar issues arise for other work whose relevance to QIP is uncertain. Some PC members wanted more guidance here although I think that the field is still too dynamic for clear rules to be possible.

QIP discussions and later reviewing
Some of us have pushed for open reviewing and the buy-in has been only partial, since reviewers already do a lot of work but feel that a public review would need to be written to a higher standard (among other reasons). Still, the reviews are valuable resources. One way to use them is to informally connect editors for journals to either QIP reviewers or at least members of the PC who would have access to the QIP reviews. This gets around the one-shot nature of conference reviewing by introducing some continuity. One drawback is the risk of introducing correlated errors, or more generally, amplifying some opinions that were developed perhaps hastily. In any case, this will not happen often, since only PC members even know which papers were rejected from QIP. I mention it in case the community has any feedback on this.

More thoughts from the PC
In writing this post, I polled the PC for their thoughts on the process. In no particular order, here are some of the things they said (not all relevant to the PC process). My comments are in italics.

  • Bring back the rump session! I agree! Although it was getting long and evenings were getting full. People are always allowed to skip things.
  • Cheaper conference dinner. It may not be easy to “just order pizza and get some kegs” for 500 people but I also would vote for something more informal, perhaps where you can walk around and talk to different people.
    • Update on 23 July: This is Aram again now. This post was mostly about the PC process but I’ll say a little about local organizing. Local organizing is super time consuming, especially as QIP gets bigger, and in general we should all be grateful for whatever we get even if we think we ourselves would’ve done it differently (except for that boat show). So after posting the above complaints, I should also mention that I’m really grateful to all the local organizers for their hard, stressful and inadequately thanked work. It seems that conference dinners (and rump sessions, and everything else outside the official venue) are usually hard to do cheaply for very large groups. I do think that when conference dinners are expensive it’s good to have their price not folded into the registration fee, even though this will inevitably draw more complaints.
  • Wow there are a lot of clearly subpar submissions. Maybe we should restrict submissions somehow? I don’t think excessive submissions are a problem but a few different PC members brought it up. Perhaps there could be mechanisms to reduce the # of independent reviews for submissions that are clearly below threshold?
  • Some PC members wanted more clarity on some of the above issues, like our approach to experimental work, older submissions, and submissions that were seen as deviating from the guidelines. Others (like me) were happy with flexibility.
  • Along with a low acceptance rate, there is a concern about “outsiders” having trouble meeting the ever-evolving QIP standards. This is one of the arguments given for requiring 3-page abstracts + full papers, which do generally tend to guide submissions towards a format more likely to be accepted. I agree with this but think that there is also scope for putting more “non-binding” advice in the CFP, which can help authors without forcing either authors or referees to be too rigid.

Update on 30 April, 2019

I’ll add two more ideas.

  • In addition to assigning each paper to 3 reviewers, also assign one PC member as an “observer” to each paper. The observer acts like a PC chair just for that paper and doesn’t have to read or review the paper but should read the other reviews and do things like: tell reviewers if their score doesn’t seem to match their review text or generally if their reviews are missing something like a discussion of significance, prod disagreeing reviewers to answer each other’s points, solicit more opinions when needed, and when necessary, tell the PC chair to pay attention to a discussion. Observers don’t have to be experts in that paper, and in fact could help enforce another idea, which is that reviews should explain why the paper is interesting to the broader QIP community.

    The advantage of delegating this observer task is to spread the workload of the PC chair, who can often be a weak link in the discussion phase. I remember wading through hundreds of easychair notifications and writing many many quick notes asking reviewers to clarify their scores, take another look at contrary opinions, or generally to re-open discussion. The whole process, including skimming the reviews, might take just a few minutes, but might make the difference between the discussion continuing or dying out. Of course every PC member could be doing this without being officially appointed ‘observer’ but in practice most of the PC will do a good job of what they’re asked but will not seek out places to do extra work. The observer role may also be helpful when there are strong disagreements between PC members. In this case, it’s nice to have someone who can politely moderate without taking an opinion of their own. This recommendation would apply also to conferences like TQC or AQIS where there is a relatively short discussion period.

  • If we’re going to keep 3-page abstracts (which I don’t love but others do) we should post 5-10 examples of good ones from past years, with permission of the authors. These don’t have to be the best papers but should be the ones that exemplify the unique requirements of the format. This is to address the problem that some people don’t know how to present their work in a way that works well with the conference reviewing process. The advantage of doing this over posting all 3-page abstracts is that it reduces the burden on the authors (writing a 3-page abstract for the PC is easier if you don’t have to also write it for the public) and anyway people only need 5-10 examples, not 60.

You can comment below but most people seem to be using Twitter for this.

1 Comment

Debbie Leung   2018-09-10 02:34:08

Thanks so much for putting these thoughts together and making them publicly available. They are great resources for future PC chairs. The comments also make the review process more transparent for the QIP community. I hope this starts a good tradition and in the long term, it may be worthwhile to take a small amount of funding from QIP to host a permanent website for such valuable information.

  • On anonymous submission: personally I am in favor of making it mandatory. To be more precise, I am in favor of taking the authors’ identities out of most of the review process. I will give two reasons, suggest some solutions, and discuss some potential problems.

The first motivation is to take the competition closer to a level playing field. While reviews are conducted in good faith, there can still be implicit biases, possibly systematically favoring more well-known researchers. It is true that authorship information is often publicly available, and the reviewers could have seen presentations of the results before the reviews; however, an active reminder of the authors’ identities seems counterproductive.

The second motivation is to improve the discussion. While reviewers with clear COIs are excluded from the discussion of the corresponding submissions, for a small field, a typical reviewer can be involved in a substantial number of discussions concerning several close former collaborators or groupmates. I have recently heard from a colleague that he/she will be uncomfortable discussing a paper in the presence of a close ally of the authors. I think not having the authors’ information can help.

Of course, anonymous submissions will not completely resolve the above problems, but I hope that they are some first steps.

The downside, of course, is the logistics. I wish there were a way for Easychair to manage COI at the level of assigning the submissions. But then if the reviewers seek subreviewers, I am afraid the most obvious candidates are the authors themselves! I do not know how this is being handled elsewhere, but there are other CS conferences that have adopted the mandatory anonymous system (including Crypto) so perhaps it is worth investigating. Alternatively, the chair and co-chairs can be a trusted party to oversee accidental COIs.

Our situation is close to some theoretical CS conferences, so, perhaps we can borrow some of their thoughts and lessons, such as discussed in the following:

https://windowsontheory.org/2018/01/11/on-double-blind-reviews-in-theory-conferences/ http://mybiasedcoin.blogspot.com/2018/01/double-blind-alenex.html http://blog.geomblog.org/2018/01/report-on-double-blind-reviewing-in.html http://blog.geomblog.org/2018/01/double-blind-review-at-theory.html

There are some studies for larger medical journals, but they may not be applicable to us: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5213965/#CR12 .

I will not be religious about stated amount of difference, but I hope the above provides some thoughts for future discussion.

I am intrigued by one thought, that perhaps one can let the authors’ information be available but not immediately so. For example, one may have to ask Easychair before the information will be displayed, so it can still be used when considering subreviewers, but the information can be promptly forgotten afterwards. Having the information in the smallest possible font hidden as a footnote on the last page may work too. In any case, I do not see any reason for names and institutes of authors to be displayed prominently on the submission (I have personally often removed institute information on the 3-page abstracts; at the least, that saves space for the actual science).

  • COI is tricky. We do not want people too close to the authors to review, but we also want strong expertise. A more radical view is that, out of 3 reviews, allowing up to one review with a minor COI (same institute or a recent collaborator, in exchange for more expertise) can be valuable, since the “majority” of the reviews are still arm’s length, and biases will show up quite clearly. How interesting a subject is and how incremental a result looks vary a lot from one small subcommunity to another. I think having opinions and discussions from a more diverse group of reviewers can be useful. Given the ridiculous number of subjects the PC has to cover, there may be very few experts on a subject, and excluding them due to minor COIs can result in a less fair process.

  • I second the policy that subreviewers should not enter scores.

  • Papers with no bids probably can be assigned to one dedicated reviewer, who can then decide if further reviews are needed.

  • I agree the moderator should not be one of the reviewers.

  • I will include an additional issue, from a discussion with a former PC chair. PC members seem to vary a lot in their level of “generosity”. Some PC members were said to have accepted up to 50% of their assignments, while the instruction was to accept papers on par with the better half of the previously presented works. I wonder if the variation was exceptional or a norm; if so, whether there are methods to mitigate the problem.

  • I think there should be a clear statement of what community QIP is serving, and what the goals are. The field is shifting so much in just the last 2-3 years … Perhaps the scientific committee can initiate some thoughts, and a more thorough discussion could be brought up in the next 1-2 QIPs, involving also former attendees who may be absent.

  • Cost should be kept low, and there is more value to a more open event with more social interaction (like the rump session and poster sessions) than a sit-down evening with pre-scheduled speeches or performances. Given that people stay for 4-5 nights at best, with a potential evening for industrial expo, and possibly one rump session, there may be too little time for attendees to explore collaborations etc.
